Abstract

We present a simple and practical $(1+\varepsilon)$-approximation algorithm for the Fréchet distance
between two polygonal curves in $\mathbb{R}^d$. To analyze this algorithm we introduce a new realistic
family of curves, $c$-packed curves, that is closed under simplification. We believe the notion
of $c$-packed curves to be of independent interest. We show that our algorithm has near-linear
running time for $c$-packed polygonal curves, and we obtain similar results for other input models,
such as low density polygonal curves.

1 Introduction 

Comparing geometric shapes is a task that arises in a wide range of applications. The Fréchet
distance and its variants have been used, to this end, to compare curves in applications such as
dynamic time-warping [KP99], speech recognition [KHM+98], signature and handwriting recognition
[MP99, SKB07], matching of time series in databases [KKS05], as well as geographic applications,
such as map-matching of vehicle tracking data [BPSW05, WSP06], and moving objects analysis
[BBG08a, BBG+08b].

Informally, the Fréchet distance between two curves is the maximum distance a point on the first
curve has to travel as this curve is being continuously deformed into the second curve; see
Section 2.2 for the formal definition. Unlike the Hausdorff distance, which is solely based on
nearest neighbor distances between points on the curves, the Fréchet distance requires continuous
and order-preserving assignments of points and hence is better suited for comparing curves with
respect to their intrinsic structure.




The Fréchet distance between two curves might be arbitrarily larger than their Hausdorff
distance, as demonstrated by the figure on the left, and as this example shows, it
seems to be a more natural measure of similarity between curves.

Previous results. For two polygonal curves of total complexity $n$ in the plane, their
Fréchet distance can be computed in $O(n^2 \log n)$ time [AG95], and their Hausdorff distance can be
computed in $O(n \log n)$ time [Alt09]. It has been an open problem to find a subquadratic algorithm
for computing the Fréchet distance for two curves. For the problem of deciding whether the Fréchet
distance between two curves is smaller than or equal to a given value, a lower bound of
$\Omega(n \log n)$ was given by [BBK+07]. Recently, Alt [Alt09] conjectured that the decision problem
may be 3SUM-hard. The only subquadratic algorithms known are for quite restricted classes of curves,
such as closed convex curves and $\kappa$-bounded curves [AKW04]. For a curve to be $\kappa$-bounded
means, roughly, that for any two points on the curve the portion of the curve in between them cannot
be further away from either point than $\kappa/2$ times the distance between the two points. For
closed convex curves the Fréchet distance equals the Hausdorff distance, and for $\kappa$-bounded
curves the Fréchet distance is at most $(1+\kappa)$ times the Hausdorff distance, and hence the
$O(n \log n)$ algorithm for the Hausdorff distance applies.

Aronov et al. [AHK+06] provided a near-linear time $(1+\varepsilon)$-approximation algorithm for the
discrete Fréchet distance, which only considers distances between vertices of the curves. Their
algorithm works for backbone curves, which are used to model protein backbones in molecular
biology. Backbone curves are required to have, roughly, unit edge length and a minimal distance
between any pair of vertices. They use curve simplification to speed up their algorithm. Agarwal
et al. [AHMW05] studied fast simplification that preserves the Fréchet distance.

The input model. We introduce a new class of curves, called $c$-packed curves, for which we
can approximate the Fréchet distance quickly, given that the constant $c$ is small. Intuitively,
the constant $c$ measures how "unrealistic" the input is. We compare this new input model to
previous models such as fatness and low density, as well as $\kappa$-boundedness. These so-called
realistic input models are commonly used for the analysis of problems where the worst case
complexity is dominated by degenerate or contrived configurations which are highly unlikely to
occur in practice; see [dBKSV02] for an overview.

A curve $\pi$ is $c$-packed if the total length of $\pi$ inside any ball is bounded by $c$ times the
radius of the ball. A $\kappa$-bounded curve might have arbitrary length while maintaining a finite
diameter, and as such may not be $c$-packed; see Section 4.3. But unlike $\kappa$-bounded curves, the
Fréchet distance between two $c$-packed curves might be arbitrarily larger than their Hausdorff
distance. Indeed, $c$-packed curves are considerably more general and a more natural family of
curves. For example, a $c$-packed curve might self-cross and revisit the same location several
times, and the class of $c$-packed curves is closed under concatenation, none of which is true for
$\kappa$-bounded curves. Intuitively, $c$-packed curves behave reasonably in any resolution.
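To make the definition concrete, the following is a minimal Python sketch (not part of the paper) that measures the length of a polygonal curve inside a ball and derives an empirical lower bound on the packedness constant $c$ by sampling balls. The function names `length_inside_ball` and `packedness_lower_bound`, and the choice of sampling only balls centered at vertices, are illustrative assumptions; an exact computation of $c$ would have to consider all balls.

```python
import math

def length_inside_ball(curve, center, r):
    """Total length of the polygonal curve (a list of points) inside B(center, r)."""
    total = 0.0
    for a, b in zip(curve, curve[1:]):
        d = [bi - ai for ai, bi in zip(a, b)]        # direction of the edge
        f = [ai - ci for ai, ci in zip(a, center)]   # edge start relative to the center
        A = sum(x * x for x in d)
        if A == 0.0:
            continue                                  # degenerate edge has no length
        B = 2.0 * sum(x * y for x, y in zip(d, f))
        C = sum(x * x for x in f) - r * r
        disc = B * B - 4.0 * A * C
        if disc <= 0.0:
            continue                                  # edge misses (or only touches) the ball
        s = math.sqrt(disc)
        t0 = max(0.0, (-B - s) / (2.0 * A))           # clip the chord to the edge
        t1 = min(1.0, (-B + s) / (2.0 * A))
        if t1 > t0:
            total += (t1 - t0) * math.sqrt(A)
    return total

def packedness_lower_bound(curve, radii):
    """Largest ratio length/radius over sampled balls (assumption: centers at vertices)."""
    return max(length_inside_ball(curve, p, r) / r for p in curve for r in radii)
```

For a straight segment, a ball of radius $r$ centered at an interior point far from both endpoints captures length $2r$, so a segment is 2-packed; the vertex-centered sampling above only sees half of that, which is why the result is a lower bound rather than the true constant.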

See the figure on the right for a few examples of $c$-packed curves. The boundary of convex
polygons, algebraic curves of bounded maximum degree, the boundary of $(\alpha,\beta)$-covered
shapes [Efr05], and the boundary of $\gamma$-fat shapes [dB08] are all $c$-packed. Indeed, the
boundaries of $(\alpha,\beta)$-covered shapes and $\gamma$-fat shapes are assumed to be formed by a
constant number of algebraic curves of bounded maximum degree. If one removes the requirement that a
$\gamma$-fat curve be of bounded descriptive complexity, then also fractal curves, like Koch's
snowflake, which can have infinite length within a bounded area, can be fat [BCD11]. Naturally,
these curves cannot be $c$-packed. Interestingly, one can show that $(\alpha,\beta)$-covered
polygons are $c$-packed even if they have unbounded complexity; see the appendix and also the
result of Bose et al. [BCD11].

Curves type         Running time                                                          See
------------------  --------------------------------------------------------------------  -----------------------
$c$-packed          $O(cn/\varepsilon + cn \log n)$                                       Theorem, Section 4.1
$\kappa$-straight   Same as $2\kappa$-packed                                              Lemma 4.16
$\kappa$-bounded    $O((\kappa/\varepsilon)^d n + \kappa^d n \log n)$                     Section 4.3
$O(1)$-low density  $O(n^{2(d-1)/d}/\varepsilon^2 + n^{2(d-1)/d} \log n)$                 Theorem 4.14
$c$-packed & closed $O(c^2 n/\varepsilon^2 + c^2 n \log n)$                               Theorem 5.5

Table 1: Summary of new results for computing a $(1+\varepsilon)$-approximation to the Fréchet
distance between two curves $\pi$ and $\sigma$ with $n$ vertices in $\mathbb{R}^d$.

It is easy to verify that $c$-packed curves are also low density [dBKSV02], but a low density curve
might not be $c$-packed, for any bounded $c$; see Section 4.2. However, the class of $c$-packed
curves is closed under simplification, see Lemma 4.3, and this is not true for low density curves.

Our results. We present a new algorithm for computing a $(1+\varepsilon)$-approximation of the
Fréchet distance for polygonal curves in $\mathbb{R}^d$. Underlying the algorithm are several new
insights. First, we use the idea of curve simplification to reduce the complexity of the free space
diagram, as this simplification results in a contraction of the corresponding rows or columns in
the free space diagram. We introduce the notion of relative free space complexity in Definition 3.3
to capture the complexity of the free space diagram of two curves, which are simplified to the
appropriate resolution. Surprisingly, without simplification, almost any two curves from natural
families of curves can have a free space diagram for the value realizing the Fréchet distance that
has quadratic complexity (even in the plane). Secondly, we present an efficient construction
algorithm for this reduced size free space diagram that enables us to solve the decision problem in
time linear in the relative free space complexity of the curves. Thirdly, we prove that
monotonicity events are sufficiently close to vertex-edge events or an approximate distance between
two vertices of the curves. Therefore, the search for the Fréchet distance can be done efficiently
without using parametric search or random sampling, by using approximate distance selection.
Carefully combining these insights yields the new algorithm, which has running time near linear in
the relative free space complexity of the input curves.

In the second part of the paper, we analyze the relative free space complexity for various families
of curves. We prove that $c$-packed curves have linear relative free space complexity for fixed $c$
and $\varepsilon$. We next prove a subquadratic bound on the relative complexity of the free space of
low density curves. This relies on a new packing lemma showing that, if the simplification of a low
density curve is long inside a relatively small area, then the original curve must contain many
vertices in the vicinity of this region. We also prove that the relative free space complexity of
$\kappa$-bounded curves is linear for a fixed $\kappa$, which leads to an improvement of the result
by Alt et al. [AKW04].

These bounds imply that the approximation algorithm provides fast approximation for the
Fréchet distance for all these types of curves. We also show how to adapt our algorithm to handle
closed curves. The new results are summarized in Table 1.

Organization. In Section 2 we provide some background on the Fréchet distance and the notion
of the free space diagram. In Section 3 we describe the approximation algorithm that uses
simplification. To this end, we show in Section 3.1 that it suffices to only compute the reachable
parts of the free space diagram, and in Section 3.2 we present a fuzzy decider procedure and show
how it can be used to make exact decisions during a binary search for the Fréchet distance. In
Section 3.3 we deal with the different subroutines used in the search for the Fréchet distance, and
in Section 3.4 we give the resulting general algorithm and analyze its correctness and running time,
which is near linear in the relative free space complexity. In Section 4 we bound the relative free
space complexity of various families of curves. In particular, in Section 4.1 we introduce the
notion of $c$-packed curves and study their behavior under simplification. In Section 4.3 we bound
the relative free space complexity of $\kappa$-bounded curves, and in Section 4.2 we handle low
density curves. In Section 5 we extend the algorithm to closed curves. We conclude with a
discussion in Section 6.

2 Preliminaries 

2.1 Notations and Definitions 

Let $\pi$ be a curve in $\mathbb{R}^d$; that is, a continuous mapping from $[0,1]$ to $\mathbb{R}^d$.
In the following, we will identify $\pi$ with its range $\pi([0,1]) \subseteq \mathbb{R}^d$ if it is
clear from the context. The curve $\pi$ is closed if $\pi(0) = \pi(1)$. We use $\|\cdot\|$ to denote
the Euclidean distance as well as the length of a curve. For a polygonal curve $\pi$, let $V(\pi)$
denote the set of vertices of $\pi$. For two points $p$ and $q$ on a curve $\pi$, let $\pi[p,q]$
denote the portion of the curve between the two points.

We denote with $B(p,r)$ the ball of radius $r$ centered at $p$, and $S(p,r)$ denotes the
corresponding sphere. Given a set of numbers $U \subseteq \mathbb{R}$, an atomic interval of $U$ is a
(possibly infinite) maximal interval on the real line that does not contain any point of $U$ in its
interior. For a point set $P$, let $D(P)$ be the set of all pairwise distances of points in $P$.

2.2 Fréchet Distance and the Free Space Diagram

A reparameterization is a bijective and continuous function $f : [0,1] \to [0,1]$. It is
orientation-preserving if $f(0) = 0$ and $f(1) = 1$. Given two reparameterizations $f$ and $g$ for
two curves $\pi$ and $\sigma$, respectively, define their width as

$$\mathrm{width}_{f,g}(\pi,\sigma) = \max_{\alpha \in [0,1]} \| \pi(f(\alpha)) - \sigma(g(\alpha)) \| .$$

This can be interpreted as the maximum length of a leash one needs to walk a dog, where the
dog walks monotonically along $\pi$ according to $f$, while the handler walks monotonically along
$\sigma$ according to $g$. In this analogy, the Fréchet distance is the shortest possible leash
admitting such a walk.

Formally, given two curves $\pi$ and $\sigma$ in $\mathbb{R}^d$, the Fréchet distance between them is

$$d_{\mathcal{F}}(\pi,\sigma) = \inf_{f,g} \mathrm{width}_{f,g}(\pi,\sigma),$$

where $f$ and $g$ are orientation-preserving reparameterizations of the curves $\pi$ and $\sigma$,
respectively. The Fréchet distance complies with the triangle inequality; that is, for any three
curves $\pi$, $\sigma$ and $\tau$ we have that
$d_{\mathcal{F}}(\pi,\tau) \le d_{\mathcal{F}}(\pi,\sigma) + d_{\mathcal{F}}(\sigma,\tau)$.

Let $\pi$, $\sigma$ be two curves and $\delta > 0$ a parameter. The free space of $\pi$ and $\sigma$
of radius $\delta$ is defined as

$$\mathcal{D}_{\le\delta}(\pi,\sigma) = \{ (s,t) \in [0,1]^2 \mid \| \pi(s) - \sigma(t) \| \le \delta \} .$$



We are interested only in polygonal curves. Then the square $[0,1]^2$ can be broken into a (not
necessarily uniform) grid called the free space diagram, where a vertical line corresponds to a
vertex of $\pi$ and a horizontal line corresponds to a vertex of $\sigma$. Every two segments of
$\pi$ and $\sigma$ define a free space cell in this grid. In particular, let
$C_{i,j} = C_{i,j}(\pi,\sigma)$ denote the free space cell that corresponds to the $i$th edge of
$\pi$ and the $j$th edge of $\sigma$. The cell $C_{i,j}$ is located in the $i$th column and $j$th
row of this grid.

It is known that the free space, for a fixed $\delta$, inside such a cell $C_{i,j}$ (i.e.,
$\mathcal{D}_{\le\delta}(\pi,\sigma) \cap C_{i,j}$) is the clipping of an affine transformation of a
disk to the cell [AG95], see the figure to the right; as such, it is convex and of constant
complexity. Let $I^h_{i,j}$ denote the horizontal free space interval at the top boundary of
$C_{i,j}$, and let $I^v_{i,j}$ denote the vertical free space interval at the right boundary.

The Fréchet distance between $\pi$ and $\sigma$ is at most $\delta$ if and only if there is an
$(x,y)$-monotone path in the free space diagram between $(0,0)$ and $(1,1)$ that is fully contained
in $\mathcal{D}_{\le\delta}(\pi,\sigma)$. Let the reachability intervals
$R^h_{i,j} \subseteq I^h_{i,j}$ and $R^v_{i,j} \subseteq I^v_{i,j}$ consist of the points $(x,y)$ on
the boundary that are reachable by a monotone path from $(0,0)$ to $(x,y)$.

Such a path to $(1,1)$ can be computed, if it exists, in $O(n^2)$ time by dynamic programming,
where $n$ is the total complexity of the two polygonal curves $\pi$ and $\sigma$; see [AG95].



2.2.1 Free Space Events 

To compute the Fréchet distance, consider increasing $\delta$ from $0$ to $\infty$. As $\delta$
increases, structural changes to the free space happen. We are interested in the radii (i.e., the
values of $\delta$) of these events.

Consider a segment $u$ of $\pi$ and a vertex $p$ of $\sigma$. A vertex-edge event
corresponds to the minimum value $\delta$ such that $u$ is tangent to $B(p,\delta)$.
In the free space diagram, this corresponds to the event that a free space
interval that consists of only one point was just created. The line supporting
this boundary edge corresponds to the vertex, and the other
dimension corresponds to the edge. Naturally, the event could happen at a vertex of $u$.
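As an illustration, the radius of a vertex-edge event is simply the distance from the vertex $p$ to the closest point of the segment $u$, which may be an endpoint of $u$. The following Python sketch is an assumption-labeled helper, not code from the paper; the name `vertex_edge_event_radius` is hypothetical.

```python
import math

def vertex_edge_event_radius(p, a, b):
    """Smallest delta such that the segment ab touches the ball B(p, delta)."""
    d = [bi - ai for ai, bi in zip(a, b)]     # direction of the segment
    w = [pi - ai for ai, pi in zip(a, p)]     # p relative to the segment start
    dd = sum(x * x for x in d)
    if dd == 0.0:
        t = 0.0                               # degenerate segment: a single point
    else:
        # Parameter of the orthogonal projection of p, clamped to the segment.
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(d, w)) / dd))
    closest = [ai + t * x for ai, x in zip(a, d)]
    return math.dist(p, closest)
```

When the clamped parameter `t` is 0 or 1, the event happens at a vertex of the segment, which is the case mentioned at the end of the paragraph above.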

The second type of event, a monotonicity event, corresponds to a value $\delta$ for which a
monotone subpath inside $\mathcal{D}_{\le\delta}$ becomes feasible, see Figure 1. Geometrically,
this corresponds to two vertices $p$ and $q$ on one curve and a directed segment $u$ on the other
curve such that: (1) $u$ passes through the intersection $S(p,\delta) \cap S(q,\delta)$, and (2) $u$
intersects $B(q,\delta)$ first and $B(p,\delta)$ second, where $p$ comes before $q$ in the order
along the curve $\pi$.

Other values of $\delta$ that would be relevant to our algorithm are the distances between any pair
of points of $V(\pi) \cup V(\sigma)$. Technically, apart from the two single events that the
endpoints of the curves are being matched to each other, these vertex-vertex events are vertex-edge
events when they are relevant, but they will be handled naturally by our algorithm.




2.3 Curve Simplification 

We suggest a straightforward greedy algorithm for curve simplification, which is sufficient for our
purposes. We comment that Agarwal et al. [AHMW05] suggested a more aggressive (but slightly
slower and more complicated) simplification algorithm that can be used instead.

Figure 1: Two curves $\pi$ and $\sigma$ and their free space diagram
$\mathcal{D}_{\le\delta}(\pi,\sigma)$, where $p = \pi(s)$, $q = \pi(s')$ and $r = \sigma(t)$. Here,
$\delta$ is the minimal free space parameter such that a monotone path exists, i.e., in this
example $d_{\mathcal{F}}(\pi,\sigma)$ coincides with a monotonicity event.

Algorithm 2.1 Given a polygonal curve $\pi = p_1 p_2 p_3 \cdots p_k$ and a parameter $\mu > 0$,
consider the following simplification algorithm: First mark the initial vertex $p_1$ and set it as
the current vertex. Now scan the polygonal curve from the current vertex until it reaches the first
vertex $p_i$ that is at distance at least $\mu$ from the current vertex. Mark $p_i$ and set it as
the current vertex. Repeat this until reaching the final vertex of the curve, and also mark this
final vertex. Consider the curve that connects only the marked vertices, in their order along
$\pi$. We refer to the resulting curve $\pi' = \mathrm{simpl}(\pi,\mu)$ as the
$\mu$-simplification of $\pi$. Note that this simplification can be computed in
linear time.
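The greedy scan of Algorithm 2.1 can be sketched in a few lines of Python. This is an illustrative transcription, assuming a curve is given as a list of coordinate tuples; the function name `simplify` is our own and stands in for $\mathrm{simpl}(\pi,\mu)$.

```python
import math

def simplify(curve, mu):
    """Greedy mu-simplification: keep a vertex once it is at distance >= mu
    from the last kept vertex; always keep the first and the final vertex."""
    marked = [curve[0]]
    current = curve[0]
    for p in curve[1:-1]:                  # interior vertices, in order along the curve
        if math.dist(current, p) >= mu:
            marked.append(p)
            current = p
    if len(curve) > 1:
        marked.append(curve[-1])           # the final vertex is marked unconditionally
    return marked
```

Each vertex is inspected once, which matches the linear running time stated above. Every edge of the output, except possibly the last one, has length at least `mu`.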

Remark 2.2 The simplified curve has the useful property that all its segments are of length at
least $\mu$, except for the last edge that might be shorter. For the sake of simplicity of
exposition, we assume that the last segment in the simplified curve also has length at least
$\mu$. Our arguments can be easily modified to handle this more general case.

Lemma 2.3 For any polygonal curve $\pi$ in $\mathbb{R}^d$, and $\mu > 0$, it holds that
$d_{\mathcal{F}}(\pi, \mathrm{simpl}(\pi,\mu)) \le \mu$.

Proof: Consider a segment $u$ of $\mathrm{simpl}(\pi,\mu)$ and the portion $\widehat{\pi}$ of $\pi$
that corresponds to it. Clearly, all the vertices of $\widehat{\pi}$ are contained inside a ball of
radius $\mu$ centered at the first endpoint of $u$ visited by $\pi$, except the last vertex of
$\widehat{\pi}$. As such, one can parameterize $u$ and $\widehat{\pi}$ such that initially the point
on $u$ stays at the first vertex of $u$ while visiting all vertices of $\widehat{\pi}$ (except the
last one), and then the two points move in sync on $u$ and the last segment of $\widehat{\pi}$, in
such a way that the distance is always at most $\mu$. ■

3 The Approximation Algorithm 

3.1 Computing the Reachable Free Space 

For two curves $\pi$ and $\sigma$, their reachable free space, denoted by
$\mathcal{R}_{\le\delta}(\pi,\sigma)$, is the set of all the points of
$\mathcal{D}_{\le\delta}(\pi,\sigma)$ that are reachable from $(0,0)$ by an $(x,y)$-monotone path.

The set $\mathcal{R}_{\le\delta}$ has finite descriptive complexity inside each grid cell, and we
need to describe it only for the grid cells that have non-empty intersection with
$\mathcal{R}_{\le\delta}$. Clearly, generating only those grid cells is sufficient to decide if
there is a monotone path between $(0,0)$ and $(1,1)$, which is equivalent to deciding if the Fréchet
distance between $\pi$ and $\sigma$ is smaller than or equal to $\delta$. In particular, to fully
describe $\mathcal{R}_{\le\delta}$, we will specify the reachability intervals
$R^h_{i,j} \subseteq I^h_{i,j}$ and $R^v_{i,j} \subseteq I^v_{i,j}$ for each cell $C_{i,j}$, which
describe the intersection of $\mathcal{R}_{\le\delta}$ with the top and right boundary of
$C_{i,j}$. These intervals contain all the needed information, since
$\mathcal{R}_{\le\delta} \cap C_{i,j}$ is convex.

The complexity of the reachable free space, for distance $\delta$, denoted by
$N_{\le\delta}(\pi,\sigma)$, is the total number of grid cells which have non-empty intersection
with $\mathcal{R}_{\le\delta}$. One can compute this set of cells and extract an existing monotone
path in $O(N_{\le\delta}(\pi,\sigma))$ time, by performing a BFS of the grid cells that visits only
the reachable cells. This yields the following relatively easy result. We include the details both
for the sake of completeness and because the algorithm we suggest is engagingly simple.

Lemma 3.1 Given two polygonal curves $\pi$ and $\sigma$ in $\mathbb{R}^d$, and a parameter
$\delta > 0$, one can compute a representation of $\mathcal{R}_{\le\delta}(\pi,\sigma)$ in
$O(N_{\le\delta}(\pi,\sigma))$ time. Furthermore, one can decide if
$d_{\mathcal{F}}(\pi,\sigma) \le \delta$, and if this is the case also extract reparameterizations
in $O(N_{\le\delta}(\pi,\sigma))$ time.

Proof: We create a directed graph $G$ that has a node $v(i,j)$ for every reachable free space cell
$C_{i,j}$. With each node $v(i,j)$ we store the free space intervals $I^h_{i,j}$ and $I^v_{i,j}$,
as well as the reachability intervals $R^h_{i,j} \subseteq I^h_{i,j}$ and
$R^v_{i,j} \subseteq I^v_{i,j}$.

Each node $v(i,j)$ can have an outgoing edge to its right and top neighbor; an edge between these
vertices exists if and only if the corresponding reachability interval between them is nonempty. In
particular, a monotone path from $(0,0)$ to a point $(x,y) \in C_{i,j}$ in
$\mathcal{R}_{\le\delta}$ corresponds to a monotone path in the graph $G$ from $v(1,1)$ to
$v(i,j)$. Furthermore, any such monotone path has exactly $k = i + j - 2$ edges on it.

We compute the graph $G$ on the fly by performing a BFS on it, starting from $v(1,1)$, and keeping
the invariant that when the BFS visits a node $v(i,j)$ it enqueues the vertices $v(i,j+1)$ and
$v(i+1,j)$, in this order, to the BFS queue (if they are connected to $v(i,j)$, naturally).

This implies that at any point in time, and for any $k$, the BFS queue contains the nodes on the
$k$th diagonal (i.e., all nodes $v(i,j)$ such that $i + j = k - 1$) of the diagram sorted from left
to right. However, the same node might appear twice (consecutively) in this queue.

In every iteration, the BFS dequeues the one or two copies of the same node $v(i,j)$ and merges the
two copies of the same vertex into one if necessary. Now, the one or two vertices (i.e.,
$v(i-1,j)$ and $v(i,j-1)$) that have incoming edges to $v(i,j)$ are known, as are their
reachability intervals. Therefore one can compute the reachability intervals for $v(i,j)$ in
constant time. Now, $v(i,j+1)$ is enqueued if and only if the top side of the cell $C_{i,j}$ is
reachable by a monotone path (i.e., $R^h_{i,j} \ne \emptyset$), and $v(i+1,j)$ is enqueued if and
only if the right side of the cell $C_{i,j}$ is reachable by a monotone path (i.e.,
$R^v_{i,j} \ne \emptyset$). Since $\mathcal{R}_{\le\delta}(\pi,\sigma) \cap C_{i,j}$ is convex and
of constant complexity, this can be done in constant time.

Clearly, the BFS takes time linear in the size of $G$ and it computes the reachability information
for all reachable free space cells of $\mathcal{R}_{\le\delta}(\pi,\sigma)$. Now, one can check if
$(1,1)$ is reachable by inspecting the reachability intervals for
$C_{n_\pi - 1,\, n_\sigma - 1}$, and checking if the top right corner of this cell is monotonically
reachable from the origin, where $n_\pi$ (resp. $n_\sigma$) is the number of vertices of the curve
$\pi$ (resp. $\sigma$). The monotone path realizing this can be extracted in linear time, by
introducing backward edges in the graph and tracing a path back to the origin. ■
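The decision procedure of Lemma 3.1 can be sketched as follows in Python. This is a simplified illustration under stated assumptions, not the authors' implementation: cells are indexed from 0, reachable intervals are kept in dictionaries keyed by cell, each diagonal is processed as a set (which replaces the queue-merging step of the proof), and the sketch only decides reachability without extracting the reparameterizations. The names `free_interval`, `clip_from` and `frechet_at_most` are hypothetical, and each curve is assumed to have at least two vertices.

```python
import math

def free_interval(c, a, b, delta):
    """Parameters t in [0,1] with |a + t(b-a) - c| <= delta, or None if empty."""
    d = [y - x for x, y in zip(a, b)]
    f = [x - z for x, z in zip(a, c)]
    A = sum(x * x for x in d)
    C = sum(x * x for x in f) - delta * delta
    if A == 0.0:
        return (0.0, 1.0) if C <= 0.0 else None
    B = 2.0 * sum(x * y for x, y in zip(d, f))
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    lo, hi = max(0.0, (-B - s) / (2 * A)), min(1.0, (-B + s) / (2 * A))
    return (lo, hi) if lo <= hi else None

def clip_from(interval, lo):
    """Part of a free interval at or after parameter lo (monotonicity)."""
    if interval is None:
        return None
    lo = max(interval[0], lo)
    return (lo, interval[1]) if lo <= interval[1] else None

def frechet_at_most(P, Q, delta):
    """Decide whether the Frechet distance of polygonal curves P, Q is <= delta."""
    n, m = len(P) - 1, len(Q) - 1          # number of edges of P and Q
    if math.dist(P[0], Q[0]) > delta or math.dist(P[-1], Q[-1]) > delta:
        return False
    # Reachable intervals on the left / bottom boundary of cell (i, j).
    left = {(0, 0): free_interval(P[0], Q[0], Q[1], delta)}
    bottom = {(0, 0): free_interval(Q[0], P[0], P[1], delta)}
    frontier = {(0, 0)}                    # reachable cells on the current diagonal
    while frontier:
        nxt = set()
        for (i, j) in frontier:
            LR, BR = left.get((i, j)), bottom.get((i, j))
            FT = free_interval(Q[j + 1], P[i], P[i + 1], delta)   # top boundary
            FR = free_interval(P[i + 1], Q[j], Q[j + 1], delta)   # right boundary
            # Convexity of the free space in a cell: entering from the left
            # reaches all of FT; entering from the bottom reaches FT from BR's start on.
            if LR is not None:
                top = FT
            else:
                top = clip_from(FT, BR[0]) if BR is not None else None
            if BR is not None:
                right = FR
            else:
                right = clip_from(FR, LR[0]) if LR is not None else None
            if top is not None:
                bottom[(i, j + 1)] = top
                if j + 1 < m:
                    nxt.add((i, j + 1))
            if right is not None:
                left[(i + 1, j)] = right
                if i + 1 < n:
                    nxt.add((i + 1, j))
        frontier = nxt
    # The corner (1,1) is free (checked above), so it is reachable as soon as
    # the top or right boundary of the last cell has a reachable point.
    return (n - 1, m) in bottom or (n, m - 1) in left
```

Only cells with a reachable boundary ever enter `frontier`, so the work is proportional to the number of reachable cells, in the spirit of the $O(N_{\le\delta}(\pi,\sigma))$ bound. For example, a segment from $(0,0)$ to $(2,0)$ and a curve on the same line that backtracks by one unit have Hausdorff distance 0 but Fréchet distance 1/2.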

Observation 3.2 One can compute all relevant vertex-edge events with radius $\le \delta$ in
$O(N_{\le\delta}(\pi,\sigma))$ time as follows. We compute the graph representation of
$\mathcal{R}_{\le\delta}(\pi,\sigma)$ using Lemma 3.1. Next, for each reachable cell consider the
vertex-edge events at its top and right boundaries and compute their event radii. Recall that a
cell boundary corresponds to an edge from the one curve and a vertex from the other curve. Clearly,
any cell boundary can be used by a reparameterization of width $\le \delta$ if and only if the
corresponding event radius is smaller than or equal to $\delta$.

Figure 2: The idea of the fuzzy decision procedure using simplification.

3.2 The Approximate Decision Procedure 

In the following, we are interested in the maximum complexity of the reachable free space when
considering any radius $\delta$ and simplifying the curves with radius $\varepsilon\delta$. The
reasons will become apparent shortly, in Lemma 3.5 and Lemma 3.6, where we show that the
simplification radius chosen this way enables us to either (i) compute a
$(1+\varepsilon)$-approximation of the Fréchet distance, or (ii) solve the decision problem exactly
using the simplified curves (see Section 3.3.5).

The idea underlying this approximate decision procedure is depicted in Figure 2. We simplify
the two input curves to a resolution that is (roughly) an $\varepsilon$-fraction of the radius we
care about (i.e., $\delta$), and we then use the exact decision procedure on these two simplified
curves. Since the Fréchet distance complies with the triangle inequality, and by Lemma 2.3, we can
infer the original distance from this information. In order for this approach to work, the
complexity of the reachable free space for the two simplified curves has to be small. This notion
of complexity is captured by the following definition.

Definition 3.3 For two curves $\pi$ and $\sigma$, let

$$\mathsf{N}(\varepsilon,\pi,\sigma) = \max_{\delta \ge 0} N_{\le\delta}\bigl(\mathrm{simpl}(\pi,\varepsilon\delta),\, \mathrm{simpl}(\sigma,\varepsilon\delta)\bigr)$$

be the maximum complexity of the reachable free space for the simplified curves. We refer to
$\mathsf{N}(\varepsilon,\pi,\sigma)$ as the $\varepsilon$-relative free space complexity of $\pi$
and $\sigma$. In order to give a more informative analysis, we will express the asymptotic time
complexity of our algorithms not only in terms of the size of the input, but instead use the size
of the input and the free space complexity of the input as parameters.

We assume that for any $0 < \varepsilon < 1$ the following properties hold for
$\mathsf{N}(\cdot,\cdot,\cdot)$.

(P1) For any constant $c' > 1$, it holds that $\mathsf{N}(\varepsilon/c',\pi,\sigma) = O(\mathsf{N}(\varepsilon,\pi,\sigma))$.
(P2) $\mathsf{N}(\varepsilon,\pi,\sigma) \le \mathsf{N}(\varepsilon/2,\pi,\sigma)/2$.

The above properties will hold for all the families of curves we consider. In Section 4.1 we show
that $\mathsf{N}(\varepsilon,\pi,\sigma)$ is a linear function in the number of vertices of the two
curves for a fixed $\varepsilon > 0$ if the curves are sufficiently well-behaved (see for example
Lemma 4.4). Combining this analysis with the time complexity analysis of the algorithms will yield
near-linear upper bounds on the running times of these algorithms for the classes of curves
considered.

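Remark 3.4 In the following, when we state the time complexity of our algorithms, we always
assume that $\mathsf{N}(\varepsilon,\pi,\sigma) = \Omega(n)$, where $n$ is the total number of
vertices of $\pi$ and $\sigma$.

Lemma 3.5 Let $\pi$ and $\sigma$ be polygonal curves in $\mathbb{R}^d$, and let $\varepsilon > 0$
and $\delta > 0$ be two parameters. Then, the algorithm described below outputs, in
$O(\mathsf{N}(\varepsilon,\pi,\sigma))$ time, one of the following:

(A) "$d_{\mathcal{F}}(\pi,\sigma) \le (1+\varepsilon)\delta$", and reparameterizations of $\pi$ and
$\sigma$ of width $\le (1+\varepsilon)\delta$, and this happens if
$d_{\mathcal{F}}(\pi,\sigma) \le \delta$.

(B) "$d_{\mathcal{F}}(\pi,\sigma) > \delta$" if $d_{\mathcal{F}}(\pi,\sigma) > (1+\varepsilon)\delta$.

(C) If $d_{\mathcal{F}}(\pi,\sigma) \in (\delta, (1+\varepsilon)\delta]$ then the algorithm outputs
either of the above outcomes.

In either case, the statement returned is correct.

Proof: Set $\mu = (\varepsilon/4)\delta$. Compute in linear time the curves
$\pi' = \mathrm{simpl}(\pi,\mu)$ and $\sigma' = \mathrm{simpl}(\sigma,\mu)$ using Algorithm 2.1.
Let $\delta' = \delta + 2\mu$ and observe that $\mu/\delta' = \varepsilon/(4+2\varepsilon)$. Using
Lemma 3.1 we can decide whether $d_{\mathcal{F}}(\pi',\sigma') \le \delta'$ in

$$O\bigl(N_{\le\delta'}(\pi',\sigma')\bigr) = O\bigl(\mathsf{N}(\mu/\delta',\pi,\sigma)\bigr) = O\bigl(\mathsf{N}(\varepsilon/(4+2\varepsilon),\pi,\sigma)\bigr) = O\bigl(\mathsf{N}(\varepsilon,\pi,\sigma)\bigr)$$

time, by assumption (P1). If so, we output the reparameterizations as a proof that

$$d_{\mathcal{F}}(\pi,\sigma) \le d_{\mathcal{F}}(\pi,\pi') + d_{\mathcal{F}}(\pi',\sigma') + d_{\mathcal{F}}(\sigma',\sigma) \le \delta' + 2\mu = \delta + 4(\varepsilon/4)\delta = (1+\varepsilon)\delta .$$

On the other hand, if $d_{\mathcal{F}}(\pi',\sigma') > \delta'$, then this implies, by the triangle
inequality, that

$$d_{\mathcal{F}}(\pi,\sigma) \ge d_{\mathcal{F}}(\pi',\sigma') - d_{\mathcal{F}}(\pi,\pi') - d_{\mathcal{F}}(\sigma',\sigma) > \delta' - 2\mu = \delta .$$

Therefore, the algorithm outputs "$d_{\mathcal{F}}(\pi,\sigma) > \delta$" in this case. ■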
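The structure of this proof translates directly into a small wrapper around a simplification routine and an exact decision procedure. The Python sketch below is illustrative only: `simplify(curve, mu)` and `decide_exact(pi, sigma, delta)` are assumed callables supplied by the caller (standing in for Algorithm 2.1 and Lemma 3.1), and the returned pair of a verdict string and a bound is a hypothetical interface that omits the reparameterizations.

```python
def fuzzy_decide(pi, sigma, delta, eps, simplify, decide_exact):
    """Fuzzy decision in the style of Lemma 3.5.

    Returns ("<=", (1 + eps) * delta) or (">", delta); both answers are
    correct statements about the Frechet distance of pi and sigma.
    """
    mu = (eps / 4.0) * delta                    # simplification radius
    pi_s = simplify(pi, mu)                     # each within Frechet distance mu
    sigma_s = simplify(sigma, mu)               # of its original (Lemma 2.3)
    delta_s = delta + 2.0 * mu                  # threshold for the simplified curves
    if decide_exact(pi_s, sigma_s, delta_s):
        # Triangle inequality: distance <= delta_s + 2*mu = (1 + eps) * delta.
        return "<=", (1.0 + eps) * delta
    # Triangle inequality: distance > delta_s - 2*mu = delta.
    return ">", delta
```

The two branches correspond to outcomes (A) and (B); when the true distance lies in $(\delta, (1+\varepsilon)\delta]$, either branch may be taken, which is outcome (C).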

3.2.1 How to use the Approximate Decider in a Binary Search 

In order to use Lemma 3.5 to perform a binary search for the Fréchet distance, we can turn the
"fuzzy" decision procedure into a precise one as follows.

Lemma 3.6 Let $\pi$ and $\sigma$ be two polygonal curves in $\mathbb{R}^d$, and let
$1 > \varepsilon > 0$ and $\delta > 0$ be two parameters. Then, there is an algorithm
decider($\pi$, $\sigma$, $\delta$, $\varepsilon$) that, in $O(\mathsf{N}(\varepsilon,\pi,\sigma))$
time, returns one of the following outputs: (i) a $(1+\varepsilon)$-approximation to
$d_{\mathcal{F}}(\pi,\sigma)$, (ii) $d_{\mathcal{F}}(\pi,\sigma) < \delta$, or (iii)
$d_{\mathcal{F}}(\pi,\sigma) > \delta$. The answer returned is correct.

Proof: Let $\delta' = \delta/(1+\varepsilon')$, for $\varepsilon' = c\varepsilon$, $c = 1/3$. We run
the algorithm of Lemma 3.5 with parameters $\delta$ and $\varepsilon'$. If the call returns
"$d_{\mathcal{F}}(\pi,\sigma) > \delta$", then we return this result.

Otherwise, we call Lemma 3.5 with parameters $\delta'$ and $\varepsilon'$. If it returns that
"$d_{\mathcal{F}}(\pi,\sigma) \le (1+\varepsilon')\delta'$", then
$d_{\mathcal{F}}(\pi,\sigma) \le (1+\varepsilon')\delta' = \delta$, and we return this result.

The only remaining possibility is that the two calls returned
"$d_{\mathcal{F}}(\pi,\sigma) \le (1+\varepsilon')\delta$" and
"$d_{\mathcal{F}}(\pi,\sigma) > \delta'$". But then we have found the required approximation.
Indeed, the resulting approximation factor of the reparameterizations returned by the call with
$\delta$ is at most

$$\frac{(1+\varepsilon')\delta}{\delta'} = (1 + c\varepsilon)^2 \le 1 + \varepsilon ,$$

as can be easily verified, since $0 < \varepsilon < 1$: for $c = 1/3$ we have
$(1+\varepsilon/3)^2 = 1 + 2\varepsilon/3 + \varepsilon^2/9 \le 1 + 7\varepsilon/9$. ■

3.3 Searching for the Fréchet Distance



3.3.1 Searching in a Fixed Interval 

It is now straightforward to perform a binary search on an interval $[\alpha,\beta]$ to approximate
the value of the Fréchet distance, if it falls inside this interval. Indeed, partition this
interval into subintervals of length $\varepsilon\alpha$ and perform a binary search to find the
interval that contains the Fréchet distance. There are $O(\beta/\varepsilon\alpha)$ intervals, and
this would require $O(\log(\beta/\varepsilon\alpha))$ calls to decider. By using exponential
subintervals, one can do slightly better, as testified by the following lemma.

Lemma 3.7 Given two curves $\pi$ and $\sigma$ in $\mathbb{R}^d$, a parameter $1 > \varepsilon > 0$,
and an interval $[\alpha,\beta]$, one can perform a binary search in $[\alpha,\beta]$ and obtain a
$(1+\varepsilon)$-approximation to $d_{\mathcal{F}}(\pi,\sigma)$ if
$d_{\mathcal{F}}(\pi,\sigma) \in [\alpha,\beta]$, or report that
$d_{\mathcal{F}}(\pi,\sigma) \notin [\alpha,\beta]$. The algorithm, denoted by
searchInterval($\pi$, $\sigma$, $[\alpha,\beta]$, $\varepsilon$), takes
$O\bigl(\log \frac{\log(\beta/\alpha)}{\varepsilon}\bigr)$ calls to decider.

Proof: Let $\alpha_i = \alpha(1+\varepsilon)^i$ for $i = 0, \ldots, M$, where
$M = \lfloor \log_{1+\varepsilon}(\beta/\alpha) \rfloor$, and let $\alpha_{M+1} = \beta$. Perform a
binary search, using decider($\pi$, $\sigma$, $\delta$, $\varepsilon$), to find the two values
$\alpha_i$ and $\alpha_{i+1}$ such that
$\alpha_i \le \delta^* = d_{\mathcal{F}}(\pi,\sigma) \le \alpha_{i+1}$. Since
$\alpha_{i+1} \le (1+\varepsilon)\alpha_i$, we conclude that we found the required approximation.

It might be that during this procedure one of the calls to decider($\pi$, $\sigma$, $\delta$,
$\varepsilon$) found the required approximation, and in this case we abort the binary search and
just return this approximation.

This process requires $O(\log M) = O(\log \log_{1+\varepsilon}(\beta/\alpha))$ calls to decider.
Observe that

$$M \le \log_{1+\varepsilon} \frac{\beta}{\alpha} = \frac{\ln(\beta/\alpha)}{\ln(1+\varepsilon)} = O\Bigl(\frac{1}{\varepsilon} \log \frac{\beta}{\alpha}\Bigr) .$$

Indeed, $e^{x/2} \le 1 + x \le e^x$ for $x \in [0,1]$, and this implies that
$x/2 \le \ln(1+x) \le x$, which is the inequality used above. ■
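The exponential search in this proof can be sketched as follows. The Python code below is a minimal illustration under assumptions of our own: `decider(delta)` is a caller-supplied callable that returns the string `"lt"` (distance below `delta`), the string `"gt"` (distance above `delta`), or a tuple `("approx", value)` carrying a ready-made approximation, mirroring the three outcomes of Lemma 3.6. The function name `search_interval` and this return convention are hypothetical.

```python
import math

def search_interval(decider, alpha, beta, eps):
    """Binary search over the grid alpha * (1 + eps)**i inside [alpha, beta].

    Returns a value within a (1 + eps) factor above the unknown distance,
    or None if the decider reports that the distance lies outside the interval.
    """
    r = decider(alpha)
    if isinstance(r, tuple):
        return r[1]
    if r == "lt":
        return None                       # distance lies below the interval
    r = decider(beta)
    if isinstance(r, tuple):
        return r[1]
    if r == "gt":
        return None                       # distance lies above the interval
    M = math.floor(math.log(beta / alpha) / math.log1p(eps))
    grid = [alpha * (1.0 + eps) ** i for i in range(M + 1)] + [beta]
    lo, hi = 0, len(grid) - 1             # invariant: grid[lo] < distance < grid[hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        r = decider(grid[mid])
        if isinstance(r, tuple):
            return r[1]                   # decider already found an approximation
        if r == "lt":
            hi = mid
        else:
            lo = mid
    return grid[hi]                       # grid[hi] <= (1 + eps) * grid[lo]
```

The loop runs $O(\log M)$ times, matching the number of decider calls in the lemma, with two extra calls to test the endpoints of the interval.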



3.3.2 Searching over Events 

Clearly, the procedure searchInterval($\pi$, $\sigma$, $[\alpha,\beta]$, $\varepsilon$) alone does
not suffice to solve our main problem, since the interval of distances we are searching over might
have arbitrarily large "spread" (i.e., $\log(\beta/\alpha)$ might be arbitrarily large). However,
the Fréchet distance must be sufficiently close to a free space event in one of the "approximate"
diagrams, i.e., a free space diagram of the two simplified curves. Thus, we can identify two kinds
of critical values to search over, which are candidate values for the approximate Fréchet distance.
These are the events where (i) the simplification of an input curve changes, or (ii) the
reachability within the approximate free space diagram changes (i.e., a free space event; see
Section 2.2.1).

The traditional solution to overcome this problem is to use parametric search. However, in our
case, since we are only interested in approximation, we can use a simpler, "approximate" search. It
is sufficient to search over a set of values which approximate the event values by a constant
factor, since we will use Lemma 3.7 to refine the resulting search interval in the main algorithm.
Note, for instance, that we can easily use this lemma to turn a constant factor approximation of
the Fréchet distance into a $(1+\varepsilon)$-approximation.

Algorithm 3.8 Let searchEvents($\pi$, $\sigma$, $Z$) denote the algorithm that performs a binary
search over the values of $Z$, to compute the atomic interval of $Z$ that contains the Fréchet
distance between $\pi$ and $\sigma$. This procedure uses decider (Lemma 3.6) to perform the
decisions during the search.

3.3.3 Searching over Simplifications

Consider the events when the simplified curves change, see Algorithm 2.1. Consider the set of all 
pairwise distances between vertices of $\pi$ and $\sigma$. Observe that it breaks the real line into $\binom{n}{2} + 1$ 
atomic intervals, such that in each such interval the simplification does not change. Thus $\mathrm{simpl}(\pi, \mu)$ 
(resp. $\mathrm{simpl}(\sigma, \mu)$) might result in $O(n^2)$ different curves depending on the value of $\mu$, where $n$ is the 
total number of vertices of $\pi$ and $\sigma$. As a first step we would therefore like to use Algorithm 3.8 to 
perform a binary search over those distances to find the atomic interval that contains the required 
Fréchet distance. Naively, this would require us to perform distance selection. However, it is 
believed that exact distance selection requires $\Omega(n^{4/3})$ time in the worst case [Eri95]. To overcome 
this we will perform an approximate distance selection, as suggested by Aronov et al. [AHK+06]. 
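
To see why only the pairwise vertex distances matter, consider the following sketch of a greedy simplification. Algorithm 2.1 is not reproduced in this section, so the rule below (keep a vertex once it is at distance at least $\mu$ from the last kept vertex, and always keep the two endpoints) is an assumption about it, and the name `simplify` is ours. The output depends on $\mu$ only through comparisons of $\mu$ with vertex-vertex distances, so it cannot change between two consecutive such distances.

```python
import math

def simplify(curve, mu):
    """Greedy mu-simplification of a polygonal curve given as a list of points.

    Assumed rule (not taken from this section): keep the first vertex, then
    keep each interior vertex that is at distance >= mu from the last kept
    vertex, and always keep the final vertex.
    """
    kept = [curve[0]]
    for p in curve[1:-1]:
        if math.dist(kept[-1], p) >= mu:
            kept.append(p)
    if len(curve) > 1:
        kept.append(curve[-1])
    return kept
```

For equally spaced collinear vertices with spacing 1, the values 2.2 and 2.9 lie between the consecutive vertex distances 2 and 3 and therefore give the same simplified curve.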

Lemma 3.9 Given a set $P$ of $n$ points in $\mathbb{R}^d$, one can compute in $O(n \log n)$ time a set $Z$ of 
$O(n)$ numbers, such that for any $y \in \mathcal{D}(P)$, there exist numbers $x, x' \in Z$ such that $x \le y \le x' \le 2x$. 
Let approxDistances($P$) denote this algorithm. 

Proof: Compute an 8-well-separated pairs decomposition of $P$. Using the algorithm of Callahan 
and Kosaraju [CK95] this can be done in $O(n \log n)$ time, and results in a set of pairs of subsets 
$\{(X_1, Y_1), \ldots, (X_m, Y_m)\}$, where $m = O(n)$, such that for any two points $p, q \in P$ there exists a 
pair $(X_i, Y_i)$ in the above decomposition, such that: (i) $p \in X_i$ and $q \in Y_i$ (or vice versa), and 
(ii) $\max(\mathrm{diam}(X_i), \mathrm{diam}(Y_i)) \le \min_{p_i \in X_i,\, q_i \in Y_i} \|p_i - q_i\| / 8$. 

This implies that the distances between any point of $X_i$ and any point of $Y_i$ are the same 
up to a small constant factor. As such, for every pair $(X_i, Y_i)$, for $i = 1, \ldots, m$, we pick representative 
points $p_i \in X_i$ and $q_i \in Y_i$, and set $\ell_i = (3/4) \|p_i - q_i\|$. Let $Z = \{\ell_1, \ldots, \ell_m, 2\ell_1, \ldots, 2\ell_m\}$ be the 
computed set of values. 

Consider any pair of points $p, q \in P$. For the specific pair $(X_i, Y_i)$ that contains the pair of 
points $p$ and $q$ that we are interested in, we have that 

$$\ell_i = \tfrac{3}{4} \|p_i - q_i\| \le \|p_i - q_i\| - \mathrm{diam}(X_i) - \mathrm{diam}(Y_i) \le \|p - q\| \le \|p_i - q_i\| + \mathrm{diam}(X_i) + \mathrm{diam}(Y_i) \le \tfrac{5}{4} \|p_i - q_i\| \le 2\ell_i,$$

thus establishing the claim. ■ 
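
The guarantee of Lemma 3.9 is easy to test by brute force. The sketch below is not the WSPD-based construction from the proof: as a stand-in it uses all powers of two spanning the pairwise distances (their number depends on the spread of $P$ rather than being $O(n)$), and it checks the sandwich property $x \le y \le x' \le 2x$ in quadratic time. All function names are hypothetical.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """All distances between distinct pairs of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def sandwiches(Z, y):
    """True if some x, x' in Z satisfy x <= y <= x' <= 2x."""
    return any(x <= y <= xp <= 2 * x for x in Z for xp in Z)

def power_of_two_grid(points):
    """Stand-in for approxDistances (NOT the WSPD construction):
    all powers of two bracketing the nonzero pairwise distances,
    padded by one exponent on each side against rounding."""
    ds = [d for d in pairwise_distances(points) if d > 0]
    lo = math.floor(math.log2(min(ds))) - 1
    hi = math.ceil(math.log2(max(ds))) + 1
    return [2.0 ** k for k in range(lo, hi + 1)]
```

For every distance $y$ the largest power of two $x \le y$ and its double $x' = 2x$ form a valid pair, which is the same factor-two slack that the set $Z$ of the lemma provides with only $O(n)$ values.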

3.3.4 Monotonicity Events 

The following lemma testifies that the radius of a monotonicity event must be "close" to either a 
vertex-edge event or to the distance between two vertices. Since we will approximate the vertex- 
vertex distances and perform a binary search over them, this implies that we further only need 
to consider vertex-edge events. Furthermore, by Observation 3.2 the number of those vertex-edge 
events which remain in the resulting search range can be bounded by the complexity of the reachable 
free space. 

Lemma 3.10 Let $x$ be the radius of a monotonicity event involving vertices $p, q$ and a segment $u$. 
Then there exists a number $y$ such that $y/2 \le x \le 3y$, and $y$ is either in $W = \mathcal{D}(V(\pi) \cup V(\sigma))$ or 
$y$ is the radius of a vertex-edge event. 

Proof: Let $s$ be the intersection point of $S(p, x) \cap S(q, x)$ which lies on $u$. Let $p'$ (resp. $q'$) be the 
closest point on $u$ to $p$ (resp. $q$). 



Clearly $\|p' - q'\| \le \|p - q\|$ (since the projection onto the nearest 
neighbor of a convex set is a contraction), and since $p' \in B(p, x)$ and 
$q' \in B(q, x)$, the point $s$ lies on the segment $p'q'$. 

This implies that $x = \|p - s\| \le \|p - p'\| + \|p' - s\| \le \|p - p'\| + \|p' - q'\| \le \|p - p'\| + \|p - q\|$, 
by the triangle inequality. 

A similar argument implies that 

$$x = \|p - s\| \ge \|p - p'\| - \|p' - s\| \ge \|p - p'\| - \|p' - q'\| \ge \|p - p'\| - \|p - q\|.$$

If $\|p - p'\| \ge 2\|p - q\|$ then the above implies that $x \in [1/2, 3/2]\, \|p - p'\|$. If $p'$ is an endpoint 
of $u$ then $\|p - p'\|$ is in $W$. Otherwise, $\|p - p'\|$ is the radius of the vertex-edge event between $p$ 
and $u$. In either case, this implies the claim. 

If $\|p - p'\| < 2\|p - q\|$ then $x = \|p - s\| \le \|p - p'\| + \|p - q\| \le 2\|p - q\| + \|p - q\| = 3\|p - q\|$, 
and of course $\|p - q\| \in W$. Now, the two balls of radius $x$ centered at $p$ and $q$, respectively, cover 
the segment $pq$, and we have that $\|p - q\|/2 \le x$, which implies the claim. ■ 
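
A small numerical instance may help to see the bound of Lemma 3.10 at work. The sketch below computes the point $s$ on the line supporting $u$ that is equidistant from $p$ and $q$, together with the common radius $x$; the function name `monotonicity_event` and the chosen coordinates are ours, and the function does not check that $s$ actually lies between $p'$ and $q'$ on the segment.

```python
import math

def monotonicity_event(p, q, a, b):
    """Radius x and parameter t of the point s = a + t (b - a) with |p - s| = |q - s|.

    Derived from |a + t w - p|^2 = |a + t w - q|^2 with w = b - a.
    The event lies on the segment ab only if 0 <= t <= 1.
    Assumes q - p is not orthogonal to b - a.
    """
    w = [bi - ai for ai, bi in zip(a, b)]
    num = math.dist(a, q) ** 2 - math.dist(a, p) ** 2
    den = 2 * sum(wi * (qi - pi) for wi, pi, qi in zip(w, p, q))
    t = num / den
    s = [ai + t * wi for ai, wi in zip(a, w)]
    return math.dist(p, s), t
```

For $p = (0, 1)$, $q = (4, 2)$ and $u$ the segment from $(0, 0)$ to $(10, 0)$, the point $s = (2.375, 0)$ lies between the projections $p' = (0, 0)$ and $q' = (4, 0)$, and the radius $x \approx 2.58$ satisfies $y/2 \le x \le 3y$ for $y = \|p - q\| = \sqrt{17}$.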

3.3.5 Searching with a Fixed Simplification 

Assume that we have found simplifications $\tau$ and $\eta$, such that the Fréchet distance of those curves 
yields the desired $(1 + \varepsilon)$-approximation. Clearly, an approximation of $d_{\mathcal{F}}(\tau, \eta)$ suffices for our 
result. To this end, let searchIntervalNoSimp($\pi$, $\sigma$, $[\alpha, \beta]$, $\varepsilon$) be the variant of searchInterval 
from Lemma 3.7 that uses Lemma 3.1 directly instead of calling decider. This version searches 
for the Fréchet distance in the given interval, but does not perform simplification before calling 
the decision procedure. It returns a $(1 + \varepsilon)$-approximation of the Fréchet distance, given that it is 
contained in this interval. Note that correctness and running time of Lemma 3.7 are not affected 
by this modification. 

Lemma 3.11 Let $\tau$ and $\eta$ be two given curves in $\mathbb{R}^d$, with total complexity $n$, and let $[h^-, h^+]$ be 
an interval, such that (i) $d_{\mathcal{F}}(\tau, \eta) \in [h^-, h^+]$, and (ii) there is no value of $W = \mathcal{D}(V(\tau) \cup V(\eta))$ 
in the interval $[h^-, h^+]$. Then, for $\varepsilon > 0$, one can $(1 + \varepsilon)$-approximate $d_{\mathcal{F}}(\tau, \eta)$ and compute 
reparametrizations in $O((n + N) \log(N/\varepsilon))$ time, where $N = N_{\le h^+}(\tau, \eta)$. 
Let aprxFrechetNoSimp($\tau$, $\eta$, $[h^-, h^+]$, $\varepsilon$) denote this algorithm. 

Proof: For two real numbers $x, y > 0$, we define $[x/y] = \max(x, y)/\min(x, y)$. 

Compute $\mathcal{R}_{\le h^+}(\tau, \eta)$, using Lemma 3.1. Next, using Observation 3.2, compute from $\mathcal{R}_{\le h^+}(\tau, \eta)$ 
the set $Z$ of all the radii of the vertex-edge events of $\tau$ and $\eta$ with radius at most $h^+$. Next, we 
sort $Z$, and perform a binary search over $Z$, using Lemma 3.1, for the atomic interval $I = [\alpha, \beta]$ of 
$Z$ that contains the Fréchet distance $d_{\mathcal{F}}(\tau, \eta)$. Next, call searchIntervalNoSimp($\tau$, $\eta$, $[\alpha, 4\alpha]$, $\varepsilon$) 
and searchIntervalNoSimp($\tau$, $\eta$, $[\beta/4, \beta]$, $\varepsilon$). We claim that one of these two searches performed 
on the respective intervals will discover two consecutive values $x$ and $(1 + \varepsilon)x$, such that the two 
corresponding calls to the algorithm of Lemma 3.7 imply that $d_{\mathcal{F}}(\tau, \eta) \in [x, (1 + \varepsilon)x]$. 

Indeed, the interior of $[\alpha, \beta]$ does not contain any value in $W$ or a radius of a vertex-edge event 
of $\tau$ and $\eta$. Therefore, the interval $[\alpha, \beta]$ might contain only monotonicity events of $\tau$ and $\eta$. By 
Lemma 3.10, for a monotonicity event with radius $r$ there exists a $y \in Z \cup W$, such that $[r/y] \le 3$. 
But there is no value of $Z \cup W$ in the interior of $[\alpha, \beta]$, and therefore, for any $r'' \in [4\alpha, \beta/4]$ 
and $y'' \in Z \cup W$, we have that $[r''/y''] \ge 4$. 

aprxFrechetI($\pi$, $\sigma$, $\varepsilon$) 

(A) $P = V(\pi) \cup V(\sigma)$. 

(B) $Z \leftarrow$ approxDistances($P$) (Lemma 3.9). 

(C) $[\alpha, \beta] \leftarrow$ searchEvents($\pi$, $\sigma$, $Z$, $\varepsilon$) (Algorithm 3.8). 

(D) Call searchInterval($\pi$, $\sigma$, $[\alpha, 4\alpha']$, $\varepsilon$), where $\alpha' = (30/\varepsilon)\alpha$ (Lemma 3.7). 

(E) Call searchInterval($\pi$, $\sigma$, $[\beta'/4, \beta]$, $\varepsilon$), where $\beta' = \beta/3$. 

(F) Let $\pi' = \mathrm{simpl}(\pi, \mu)$ and $\sigma' = \mathrm{simpl}(\sigma, \mu)$, for $\mu = 3\alpha$ (Algorithm 2.1). 

(G) $\delta \leftarrow$ aprxFrechetNoSimp($\pi'$, $\sigma'$, $[\alpha', \beta']$, $\varepsilon/4$) (Lemma 3.11). 

(H) Compute and return the resulting reparameterizations of $\pi$ and $\sigma$ and their width 
as the approximation. 

Figure 3: The basic approximation algorithm. 

We conclude that no monotonicity event, vertex-edge event, or value of $W$ lies in the interval 
$[4\alpha, \beta/4]$. Since the Fréchet distance must be equal to one such value, it follows that 
$d_{\mathcal{F}}(\tau, \eta) \notin (4\alpha, \beta/4)$, but this implies that either $d_{\mathcal{F}}(\tau, \eta) \in [\alpha, 4\alpha]$ or $d_{\mathcal{F}}(\tau, \eta) \in [\beta/4, \beta]$. In either case, the 
above algorithm would have found the approximate distance. 

Computing and sorting the set of vertex-edge events takes $O(N \log N)$ time by Observation 3.2. 
The binary search requires $O(\log |Z|)$ calls to the algorithm of Lemma 3.1. The two calls to 
searchIntervalNoSimp require $O(\log(1/\varepsilon))$ calls to Lemma 3.1. Now, observe that all these calls to the 
algorithm of Lemma 3.1 are done with values of $\delta \le h^+$. Thus the complexity of the reachable free 
space is bounded (up to a constant factor) by the number of vertex-edge events of values $\le h^+$, 
and this number is bounded by $|Z|$. Therefore, a call to Lemma 3.1 takes $O(|Z|)$ time. Thus, the 
overall running time is $O((n + |Z|) \log(|Z|/\varepsilon))$, and by definition $|Z| = O(N_{\le h^+}(\tau, \eta))$. ■ 

3.4 The Approximation Algorithm 

The resulting approximation algorithm is depicted in Figure 3. It will be used by the final 
approximation algorithm as a subroutine. We first analyze this basic algorithm. We will then show how 
to use it, in Lemma 3.15 below, to get a faster approximation algorithm. The algorithm depicted 
in Figure 3 performs numerous calls to decider, with approximation parameter $\varepsilon > 0$. If any of 
these calls discovers the approximate distance, then the algorithm immediately stops and returns 
the approximation. Therefore, at any point in the execution of the algorithm, the assumption is 
that all previous calls to decider returned a direction where the optimal distance must lie. In 
particular, a call to searchInterval($\pi$, $\sigma$, $I$, $\varepsilon$) would either find the approximate distance in the 
interval $I$ and return immediately, or the desired value is outside this interval. 

3.4.1 Correctness 

Lemma 3.12 Given two polygonal curves $\pi$ and $\sigma$, and a parameter $1 > \varepsilon > 0$, the algorithm 
aprxFrechetI($\pi$, $\sigma$, $\varepsilon$) computes a $(1 + \varepsilon)$-approximation to $d_{\mathcal{F}}(\pi, \sigma)$. 

Proof: If the algorithm found the approximation before step (F), then clearly it is the desired 
approximation, and we are done. (In particular, this must be the case if $4\alpha' > \beta'/4$.) 

Otherwise, because of (C), we know that $d_{\mathcal{F}}(\pi, \sigma) \in [\alpha, \beta]$. By steps (D) and (E) it must be 
that $d_{\mathcal{F}}(\pi, \sigma) \in [4\alpha', \beta'/4]$. Since $\mu = 3\alpha = (\varepsilon/10)\alpha' < \beta'/4$, it follows, by the triangle inequality, 
that 

$$d_{\mathcal{F}}(\pi', \sigma') \le d_{\mathcal{F}}(\pi', \pi) + d_{\mathcal{F}}(\pi, \sigma) + d_{\mathcal{F}}(\sigma, \sigma') \le 2\mu + \beta'/4 \le \beta'.$$

A similar argument shows that $d_{\mathcal{F}}(\pi', \sigma') \ge \alpha'$. Hence, the algorithm of Lemma 3.11 can be applied 
to $\pi'$ and $\sigma'$ for the range $[\alpha', \beta']$, as $d_{\mathcal{F}}(\pi', \sigma') \in [\alpha', \beta']$. 

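Now, by Lemma 3.11, we have that the value $\delta$ resulting from step (G) is contained in the 
interval $[d_{\mathcal{F}}(\pi', \sigma'), (1 + \varepsilon/4)\, d_{\mathcal{F}}(\pi', \sigma')]$. By the triangle inequality we conclude that the returned 
Fréchet distance is 

$$\begin{aligned}
\Delta &\le d_{\mathcal{F}}(\pi, \pi') + \delta + d_{\mathcal{F}}(\sigma, \sigma') \le d_{\mathcal{F}}(\pi, \pi') + (1 + \varepsilon/4)\, d_{\mathcal{F}}(\pi', \sigma') + d_{\mathcal{F}}(\sigma', \sigma) \\
&\le (1 + \varepsilon/4)(2\mu + d_{\mathcal{F}}(\pi, \sigma) + 2\mu) \le 5\mu + (1 + \varepsilon/4)\, d_{\mathcal{F}}(\pi, \sigma) \le (1 + \varepsilon)\, d_{\mathcal{F}}(\pi, \sigma),
\end{aligned}$$

since $5\mu = 15\alpha = (\varepsilon/2)(30/\varepsilon)\alpha = (\varepsilon/2)\alpha' \le (\varepsilon/2)\, d_{\mathcal{F}}(\pi, \sigma)$. 

Note that $\Delta \ge d_{\mathcal{F}}(\pi, \sigma)$ since it is the width of a specific reparameterization between the two 
curves. ■ 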
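
The arithmetic behind the constants of steps (D) to (G) can be double-checked with exact rational numbers. The following sketch only re-evaluates the identities used in the proof for concrete values; the function name `check_constants` and its parameters are ours, and `dist` stands for any value of the Fréchet distance that is at least $\alpha'$.

```python
from fractions import Fraction

def check_constants(alpha, eps, dist):
    """Re-evaluate the constants of the correctness proof for concrete values.

    Returns True if 5*mu + (1 + eps/4)*dist <= (1 + eps)*dist, assuming
    dist >= alpha' as guaranteed after steps (D) and (E).
    """
    alpha_p = Fraction(30) / eps * alpha      # alpha' = (30/eps) * alpha
    mu = 3 * alpha                            # simplification parameter of step (F)
    assert dist >= alpha_p
    assert mu == eps / 10 * alpha_p           # mu = (eps/10) * alpha'
    assert 5 * mu == eps / 2 * alpha_p        # 5 mu = (eps/2) * alpha'
    upper = 5 * mu + (1 + eps / 4) * dist     # bound on the returned width
    return upper <= (1 + eps) * dist
```

With $\alpha = 1$ and $\varepsilon = 1/10$ we get $\alpha' = 300$ and $\mu = 3$, and the bound holds even in the extreme case where the distance equals $\alpha'$.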



3.4.2 Running Time 

Lemma 3.13 For any $x, y \in (2\alpha, \beta/2)$, we have $\mathrm{simpl}(\pi, x) = \mathrm{simpl}(\pi, y)$ and 
$\mathrm{simpl}(\sigma, x) = \mathrm{simpl}(\sigma, y)$. 

Proof: Indeed, the interval $(\alpha, \beta)$ does not contain any value of $Z$. As such, by Lemma 3.9, 
$(2\alpha, \beta/2)$ does not contain any value of the pairwise distances between vertices of the vertex set of 
$\pi$ and $\sigma$, which implies that the simplification is the same for any value inside this interval. ■ 

Lemma 3.14 Given two polygonal curves $\pi$ and $\sigma$ with a total of $n$ vertices in $\mathbb{R}^d$, and a parameter 
$1 > \varepsilon > 0$, the running time of aprxFrechetI($\pi$, $\sigma$, $\varepsilon$) is $O(N(\varepsilon, \pi, \sigma) \log n)$. 

Proof: Computing $Z$ (and sorting it) takes $O(n \log n)$ time by Lemma 3.9. Steps (C), (D) and (E) 
perform $O(\log n + \log(1/\varepsilon)) = O(\log n)$ calls to decider, by Lemma 3.7. (Here, we assume that 
$\varepsilon = \Omega(1/n)$. If $\varepsilon < 1/n$ then we can just use the algorithm of Alt and Godau [AG95], since it 
is faster than our approximation algorithm in this case.) Each call to decider takes 
$O(N(\varepsilon, \pi, \sigma))$ time, so overall this takes $O(N(\varepsilon, \pi, \sigma) \log n)$ time. Computing the simplifications in 
step (F) with Algorithm 2.1 takes $O(n)$ time. 

By Lemma 3.11, a call to aprxFrechetNoSimp($\pi'$, $\sigma'$, $[\alpha', \beta']$, $\varepsilon/4$) takes $T = O((n + N) \log(N/\varepsilon))$ 
time, with $N = N_{\le \beta'}(\pi', \sigma')$. Now, $3\alpha$ and $\beta'$ are both inside the interval $(2\alpha, \beta/2)$, and as such, by 
Lemma 3.13, we have that $\pi' = \mathrm{simpl}(\pi, 3\alpha) = \mathrm{simpl}(\pi, \beta')$ and $\sigma' = \mathrm{simpl}(\sigma, 3\alpha) = \mathrm{simpl}(\sigma, \beta')$. 
Therefore, we have that 

$$N = N_{\le \beta'}(\pi', \sigma') = N_{\le \beta'}(\mathrm{simpl}(\pi, \beta'), \mathrm{simpl}(\sigma, \beta')) \le N(1, \pi, \sigma).$$

Thus, step (G) takes $T = O(N(1, \pi, \sigma) \log(N(1, \pi, \sigma)\, n/\varepsilon)) = O(N(1, \pi, \sigma) \log n)$ time, since 
$N(1, \pi, \sigma) \le n^2$ and $\varepsilon = \Omega(1/n)$. Observe that $N(1, \pi, \sigma) \le N(\varepsilon, \pi, \sigma)$ for $\varepsilon \le 1$. 

Finally, in order to compute the resulting reparameterizations in step (H), we compute the 
reparametrizations of $\pi$ and $\pi'$ (resp. $\sigma$ and $\sigma'$) as described in the proof of Lemma 2.3 and chain 
them with the reparameterizations of the simplified curves, which we obtained from step (G). 
Clearly, this and computing the resulting width takes $O(n)$ time. Note that by the assumption in 
Remark 3.4 the term $N(\varepsilon, \pi, \sigma)$ dominates over $O(n)$. ■ 

The running time of Lemma 3.14 can be slightly improved. 

Lemma 3.15 The algorithm aprxFrechetI depicted in Figure 3 can be modified to run in time 
$O(N(\varepsilon, \pi, \sigma) + N(1, \pi, \sigma) \log n)$ (see Definition 3.3). 

Proof: Use Lemma 3.14 with $\varepsilon_0 = 1/2$, to get a 2-approximation $\xi$ for the Fréchet distance between 
$\pi$ and $\sigma$. This takes $O(N(1, \pi, \sigma) \log n)$ time. Let $I_0 = [\xi, 2\xi]$ be the corresponding interval that 
contains the distance. We could call searchInterval($\pi$, $\sigma$, $I_0$, $\varepsilon$) and get a $(1 + \varepsilon)$-approximation 
in $O(N(\varepsilon, \pi, \sigma) \log \frac{1}{\varepsilon} + N(1, \pi, \sigma) \log n)$ time. 

One can do better by starting with a "large" $\varepsilon$ and decreasing it during the binary search for 
the right value performed by searchInterval. This is a standard idea and it was also used by 
Aronov and Har-Peled [AH08]. 

Indeed, assume that in the beginning of the $i$th step, we know that the required Fréchet distance 
lies in an interval $I_{i-1} = [\alpha_{i-1}, \beta_{i-1}]$ and $\|I_{i-1}\| = \varepsilon_{i-1} \|I_0\|$, where $\varepsilon_{i-1} = 1/2^{i-1}$. 

Let $\Delta_{i-1} = \|I_{i-1}\| = \beta_{i-1} - \alpha_{i-1}$, and let $x_{i,j} = \alpha_{i-1} + j\Delta_{i-1}/4$, for $j = 0, 1, 2, 3, 4$. Call the 
procedure decider on the three values $x_{i,1}$, $x_{i,2}$, and $x_{i,3}$, with the approximation parameter being 
$c_1 \varepsilon_i$, for $c_1 > 0$ being a sufficiently small constant. Based on the outcome of these three calls, we 
can determine in constant time which of the three intervals $J_{i,1} = [x_{i,0}, x_{i,2}]$, $J_{i,2} = [x_{i,1}, x_{i,3}]$, or 
$J_{i,3} = [x_{i,2}, x_{i,4}]$ must contain the Fréchet distance. We set this interval to be $I_i$. 

We repeat this process for $M$ steps, where $M = \lceil \lg(1/\varepsilon) \rceil$. It is easy to verify that the 
final interval now provides the required approximation. The running time of this algorithm is 
$O\!\left(N(1, \pi, \sigma) \log n + \sum_{i=1}^{M} N(\varepsilon_i, \pi, \sigma)\right)$. Now, by assumption (P2) (see Definition 3.3), we have 

$$\sum_{i=1}^{M} N(\varepsilon_i, \pi, \sigma) = O(N(\varepsilon, \pi, \sigma)),$$

and this implies the claim. 
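
To get a feeling for why the staged search costs only a constant factor more than a single call with the final $\varepsilon$, one can evaluate the sum above for a concrete cost function. The sketch below assumes, purely for illustration, that a decider call with parameter $e$ costs $1/e$ (which is the dependence on $\varepsilon$ that Lemma 4.4 gives for c-packed curves, up to the factor $cn$); the function name `staged_cost` and the default cost model are ours.

```python
import math

def staged_cost(eps, cost=lambda e: 1.0 / e):
    """Total cost of the rounds i = 1..M with eps_i = 2**-i, M = ceil(lg(1/eps)).

    `cost(e)` is a hypothetical model of the decider cost for parameter e.
    """
    rounds = math.ceil(math.log2(1.0 / eps))
    return sum(cost(2.0 ** -i) for i in range(1, rounds + 1))
```

Under this model the sum is the geometric series $2 + 4 + \cdots + 2^M < 2^{M+1} \le 4/\varepsilon$, so the total stays within a constant factor of the cost $1/\varepsilon$ of the last round.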



The Result. Putting the above together, we get the following result. 

Theorem 3.16 Given two polygonal curves $\pi$ and $\sigma$ with a total of $n$ vertices in $\mathbb{R}^d$, and a 
parameter $1 > \varepsilon > 0$, one can $(1 + \varepsilon)$-approximate the Fréchet distance between $\pi$ and $\sigma$ in 
$O(N(\varepsilon, \pi, \sigma) + N(1, \pi, \sigma) \log n)$ time (see Definition 3.3). 



Interestingly, simplification is critical for the efficiency of the above algorithm. 
Indeed, consider the two nicely behaved curves depicted on the right. The reachable 
portion of the free space diagram of these two curves, for the distance realizing the 
Fréchet distance, covers a quadratic number of cells. 

The use of simplification by itself is not sufficient to guarantee that the presented 
algorithm is efficient. Indeed, it might not be possible to simplify the input curves 
at all without losing too much information. In such contrived worst case examples, 
the free space diagram still has quadratic complexity due to the inherent structure 
of the curves. See the figure to the left for one such example. In the next section 
we will analyze the relative free space complexity using realistic input models and 
prove the efficiency of the above algorithm, given that the input is "realistic". 



4 The Relative Free Space Complexity of Families of Curves 

In this section we are going to bound the relative free space complexity for different realistic input 
models of curves. We will introduce the new class of c-packed curves, and we compare this new 
input model to the previous models of $\kappa$-boundedness and low density. 

4.1 On c-packed Curves 

We introduce a new family of curves, c-packed curves, and prove that their relative free space 
complexity $N(\varepsilon, \pi, \sigma)$ is linear, for any two curves $\pi$ and $\sigma$ in this family. This implies that the 
algorithm of Theorem 3.16 runs in near linear time for c-packed curves, which is one of our main results. 

4.1.1 Definition and basic properties 

Definition 4.1 A curve $\pi$ in $\mathbb{R}^d$ is c-packed if for any point $p$ in $\mathbb{R}^d$ and any radius $r > 0$, the 
total length of $\pi$ inside the ball $B(p, r)$ is at most $cr$. 
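
Definition 4.1 can be probed numerically. The following sketch computes the length of a polygonal curve inside a ball by clipping every segment against it, and then reports the largest ratio of length to radius over a sample of balls centered at the vertices. Since only finitely many balls are tried, the result is merely a lower bound on the smallest $c$ for which the curve is c-packed; all function names are ours.

```python
import math

def segment_length_in_ball(a, b, center, r):
    """Length of the part of segment ab that lies inside the ball B(center, r)."""
    w = [bi - ai for ai, bi in zip(a, b)]
    f = [ai - ci for ai, ci in zip(a, center)]
    A = sum(x * x for x in w)
    if A == 0:
        return 0.0
    B = 2 * sum(x * y for x, y in zip(w, f))
    C = sum(x * x for x in f) - r * r
    disc = B * B - 4 * A * C          # discriminant of |a + t w - center|^2 = r^2
    if disc <= 0:
        return 0.0
    root = math.sqrt(disc)
    t0 = max(0.0, (-B - root) / (2 * A))
    t1 = min(1.0, (-B + root) / (2 * A))
    return max(0.0, t1 - t0) * math.sqrt(A)

def curve_length_in_ball(curve, center, r):
    """Total length of the polygonal curve inside B(center, r)."""
    return sum(segment_length_in_ball(a, b, center, r)
               for a, b in zip(curve, curve[1:]))

def packedness_lower_bound(curve, radii):
    """Largest length/r over balls centered at vertices (sampled balls only)."""
    return max(curve_length_in_ball(curve, p, r) / r
               for p in curve for r in radii)
```

For a straight segment with a vertex in its middle the sampled ratio is 2, in line with the fact that a line segment meets any ball of radius $r$ in length at most $2r$.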

Lemma 4.2 Let $\pi$ be a curve in $\mathbb{R}^d$, $\mu > 0$ be a parameter, and let $\pi' = \mathrm{simpl}(\pi, \mu)$ be the 
simplified curve. Then $\|\pi \cap B(p, r + \mu)\| \ge \|\pi' \cap B(p, r)\|$ for any ball $B(p, r)$. 

Proof: Let $u$ be a segment of $\pi'$ that intersects $B(p, r)$ and let $v = u \cap B(p, r)$ be this intersection. 
Let $\pi_u$ be the portion of $\pi$ that got simplified into $u$. Observe that $\pi_u$ is a polygonal curve that lies 
inside a hippodrome of radius $\mu$ around $u$; that is, $\pi_u \subseteq \mathcal{H}_u = u \oplus B(0, \mu)$, where $\oplus$ denotes the 
Minkowski sum of the two sets, see the figure on the right. 

In particular, erect two hyperplanes passing through the 
endpoints of $v$ that are orthogonal to $v$, and observe that $\pi_u$ 
must intersect both hyperplanes. Hence, we conclude that 
the portions of $\pi_u$ in the hippodrome $\mathcal{H}_v = v \oplus B(0, \mu)$ are of 
length at least $\|v\|$. Clearly, $v \subseteq B(p, r)$ implies that $\mathcal{H}_v \subseteq B(p, r + \mu)$, which in turn implies that 
$\pi_u \cap \mathcal{H}_v \subseteq B(p, r + \mu)$ and thus $\|\pi_u \cap B(p, r + \mu)\| \ge \|v\|$. 

Summing over all segments $v$ in $\pi' \cap B(p, r)$ implies the claim. ■ 

Lemma 4.3 Let $\pi$ be a c-packed curve in $\mathbb{R}^d$, $\mu > 0$ be a parameter, and let $\pi' = \mathrm{simpl}(\pi, \mu)$ be 
the simplified curve. Then, $\pi'$ is a $6c$-packed curve. 

Proof: Assume, for the sake of contradiction, that $\|\pi' \cap B(p, r)\| > 6cr$ for some $B(p, r)$ in $\mathbb{R}^d$. 
If $r \ge \mu$, then set $r' = 2r$ and Lemma 4.2 implies that $\|\pi \cap B(p, r')\| \ge \|\pi \cap B(p, r + \mu)\| \ge 
\|\pi' \cap B(p, r)\| > 6cr = 3cr'$, which contradicts that $\pi$ is c-packed. 

If $r < \mu$ then let $U$ denote the segments of $\pi'$ intersecting $B(p, r)$ and let $k = |U|$. Observe 
that $k > 6cr/2r = 3c$, as any segment can contribute at most $2r$ to the length of $\pi'$ inside $B(p, r)$. 
Therefore we have that $\|\pi' \cap B(p, 2\mu)\| \ge \|\pi' \cap B(p, r + \mu)\| \ge \|U \cap B(p, r + \mu)\| \ge k\mu$, since every 
segment of the simplified curve $\pi'$ has a minimal length of $\mu$. By Lemma 4.2 this implies that 
$\|\pi \cap B(p, 3\mu)\| \ge \|\pi' \cap B(p, 2\mu)\| \ge k\mu > 3c\mu$, which is a contradiction to the c-packedness of $\pi$. ■ 

4.1.2 Bounding the relative free space complexity 

Lemma 4.4 For any two c-packed curves $\pi$ and $\sigma$ in $\mathbb{R}^d$, and $0 < \varepsilon < 1$, we have that 
$N(\varepsilon, \pi, \sigma) = O(cn/\varepsilon)$. 

Proof: Let $\delta > 0$ be an arbitrary number, $\mu = \varepsilon\delta$, $\pi' = \mathrm{simpl}(\pi, \mu)$ and $\sigma' = \mathrm{simpl}(\sigma, \mu)$. 

We need to show that the complexity of $\mathcal{D}_{\le\delta}(\pi', \sigma')$ is $O(cn/\varepsilon)$. A free space cell of $\mathcal{D}_{\le\delta}(\pi', \sigma')$ 
corresponds to two segments $u \in \pi'$ and $v \in \sigma'$. The free space in this cell is non-empty if and only 
if there are two points $p \in u$ and $q \in v$ such that $\|p - q\| \le \delta$. We charge this pair of points to the 
shorter of the two segments. We claim that a segment cannot be charged too many times. 

Indeed, consider a segment $u \in \pi'$, and consider the ball $B$ of radius 
$r = (3/2)\|u\| + \delta$ centered at the midpoint of $u$, see the figure on the 
right. Every segment $v \in \sigma'$ that participates in a close pair as above and 
charges $u$ for it, is of length at least $\|u\|$, and the length of $v \cap B$ is at 
least $\|u\|$. Since $\sigma'$ is $6c$-packed, by Lemma 4.3, we have that the number 
of such charges is at most 

$$c' = \frac{\|\sigma' \cap B\|}{\|u\|} \le \frac{6cr}{\|u\|} = \frac{6c\left((3/2)\|u\| + \delta\right)}{\|u\|} \le 9c + \frac{6c\delta}{\mu} = O\!\left(\frac{c}{\varepsilon}\right),$$

since $\|u\| \ge \mu$. 

We conclude that there are at most $c'n$ free space cells that contain a point of $\mathcal{D}_{\le\delta}$. The 
complexity of the free space inside a cell is a constant, thus implying the claim. ■ 

By plugging the above into Theorem 3.16 we get the following result. 

Theorem 4.5 Given two polygonal c-packed curves $\pi$ and $\sigma$ with a total of $n$ vertices in $\mathbb{R}^d$, and 
a parameter $1 > \varepsilon > 0$, one can $(1 + \varepsilon)$-approximate the Fréchet distance between $\pi$ and $\sigma$ in 
$O(cn/\varepsilon + cn \log n)$ time. 




4.2 Relative Free Space Complexity of Low Density Curves 

Definition 4.6 A polygonal curve $\pi$ in $\mathbb{R}^d$ is $\phi$-low-density if any ball $B(p, r)$ intersects at most 
$\phi$ segments of $\pi$ that are longer than $r$. 

First, observe that this input model is less restrictive than the input model which describes 
c-packed curves. It can be easily seen by a simple packing argument that a polygonal c-packed 
curve is $\phi$-low-density, for $\phi = 2c$. For any ball $B = B(p, r)$, consider the ball with the same center 
that has radius $r' = 2r$. Any edge intersecting $B$ that is longer than $r$ must contribute at least $r$ 
to the length of the intersection of the curve with the larger ball, which is bounded by $cr'$. Hence there 
can be at most $cr'/r = 2c$ edges of this type. 
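
The quantity bounded in Definition 4.6 can be counted directly for a given ball. The sketch below counts the segments of a polygonal curve that are longer than $r$ and come within distance $r$ of the center; the function names are ours, and checking a single ball says nothing about all balls.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab."""
    w = [bi - ai for ai, bi in zip(a, b)]
    v = [pi - ai for pi, ai in zip(p, a)]
    ww = sum(x * x for x in w)
    # Parameter of the closest point, clamped to the segment.
    t = 0.0 if ww == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(v, w)) / ww))
    closest = [ai + t * wi for ai, wi in zip(a, w)]
    return math.dist(p, closest)

def long_segments_hitting_ball(curve, center, r):
    """Number of segments longer than r that intersect the ball B(center, r)."""
    return sum(1 for a, b in zip(curve, curve[1:])
               if math.dist(a, b) > r and point_segment_distance(center, a, b) <= r)
```

On a zigzag with four long, nearly parallel edges, a ball of radius 1.5 placed in the middle meets all four edges, while a ball of radius 0.4 at the same center meets only the edge passing through its center.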

A curve that is low density, however, is not necessarily c-packed for a small value 
of $c$. Indeed, a low density curve $\pi$ might have an arbitrarily long intersection with 
a ball by having sufficiently small segments, see the figure on the right. However, in 
this case $\pi$ must have many vertices in the areas where its length cannot be bounded, 
as we will show in the following section. 




Claim 4.7 Let $\pi$ be a $\phi$-low density polygonal curve, and let $C$ be a hypercube in $\mathbb{R}^d$ with sidelength 
$\ell$. Then, the number of edges of length $\ge \ell$ of $\pi$ that intersect $C$ is bounded by $c_d \phi$, where 
$c_d = \lceil \sqrt{d}/2 \rceil^d$. 

Proof: Partition the cube $C$ into a $D \times D \times \cdots \times D$ grid, for $D = \lceil \sqrt{d}/2 \rceil$. Clearly, any edge that 
intersects $C$ that has length $\ge \ell$ must intersect one of the hypercubes in this grid. A hypercube of 
this grid has diameter 

$$\frac{\sqrt{d}\,\ell}{D} \le \frac{\sqrt{d}\,\ell}{\sqrt{d}/2} \le 2\ell,$$

and is included in a ball of radius $\ell$. Thus, a hypercube in this grid intersects at most $\phi$ such long 
edges. We conclude that there can be at most $\phi D^d$ long edges intersecting $C$. 

4.2.1 Low density curves can be long only if they pay for it 

Lemma 4.8 below testifies that the parts of a low density curve, where its length cannot be bounded 
by a constant, can be covered with hypercubes, such that each cube intersects at most a constant 
number of edges and at most a constant number of other cubes. We use this construction in 
Lemma 4.9 to relate the length of a low density curve to the diameter of the covered area and to the 
number of vertices. One can verify Lemma 4.8 using an easy modification of a lemma from [dB00]. 
We provide a proof, for the sake of completeness, in Appendix B. 

Lemma 4.8 Let $\pi$ be a $\phi$-low density curve, of which $n$ edges are intersecting a given hypercube 
$C$ of $\mathbb{R}^d$. The hypercube $C$ can be covered by a set of hypercubes $\mathcal{G}$, such that (i) $\bigcup_{g \in \mathcal{G}} g = C$, 
(ii) $|\mathcal{G}| \le 2^{2d+1} n$, (iii) any point $p \in C$ is covered by at most $2^d$ hypercubes, and (iv) each hypercube 
of $\mathcal{G}$ intersects at most $c_d \phi$ edges of $\pi$, where $c_d$ is a constant that depends only on the dimension 
$d$. 

Lemma 4.9 Let $\pi$ be a $\phi$-low density curve in $\mathbb{R}^d$, and let $C$ be a cube in $\mathbb{R}^d$ with side length $r$. 
Let $a = \|\pi \cap C\|$. There must be at least $\Omega((a/r)^{1 + 1/(d-1)})$ vertices of $\pi$ contained in $3C$, where 
$3C$ is the scaling of $C$ by a factor of 3 around its center. 

Proof: We will first give a lower bound on the number $n$ of edges intersecting $C$ (i.e., the edges 
that contribute to $a$). Then we will account for the edges that have endpoints outside $3C$. So, 
take the $n$ edges of $\pi$ that intersect $C$ and construct the cover of $C$ resulting from Lemma 4.8 with 
respect to these edges. 

Let $C_1, \ldots, C_N$ denote the cubes in this cover, where $r_1 \le r_2 \le \cdots \le r_N$ are the side lengths of 
the cubes used by the cover, respectively. Lemma 4.8 implies that $N \le 2^{2d+1} n$, and, therefore, a 
lower bound on $N$ would provide a lower bound on $n$. 

So, the sum of the diameters of those $N$ cubes bounds the length of the intersection: 
$a \le \sum_{i=1}^{N} c_d \phi \sqrt{d}\, r_i$, since every cube in this cover can intersect at most $c_d \phi$ edges of $\pi$. Setting $p = d$ 
and $q = d/(d - 1)$, we observe that $1/p + 1/q = 1/d + (d - 1)/d = 1$, and by Hölder's inequality, 
we have that 

$$\sum_{i=1}^{N} r_i = \sum_{i=1}^{N} r_i \cdot 1 \le \left(\sum_{i=1}^{N} r_i^d\right)^{1/d} \left(\sum_{i=1}^{N} 1^q\right)^{1/q} = \left(\sum_{i=1}^{N} r_i^d\right)^{1/d} N^{(d-1)/d}.$$

Lemma 4.8 also implies that the sum of the volumes of the cubes is at most $2^d\, \mathrm{vol}(C)$, since 
every point in $C$ is covered at most $2^d$ times by this cover. Therefore we have that 
$\sum_{i=1}^{N} r_i^d = \sum_{i=1}^{N} \mathrm{vol}(C_i) \le 2^d\, \mathrm{vol}(C) = 2^d r^d$. Hence 

$$a \le \sum_{i=1}^{N} c_d \phi \sqrt{d}\, r_i \le c_d \phi \sqrt{d} \left(\sum_{i=1}^{N} r_i^d\right)^{1/d} N^{(d-1)/d} \le c_d \phi \sqrt{d}\, (2^d r^d)^{1/d} N^{(d-1)/d}.$$

This implies that $c_2 (a/r)^{d/(d-1)} \le N$, where $c_2 = (2 c_d \phi \sqrt{d})^{-d/(d-1)}$. Since $N \le 2^{2d+1} n$, we have 
that $c_3 (a/r)^{d/(d-1)} \le n$, where $c_3 = 2^{-(2d+1)} (2 c_d \phi \sqrt{d})^{-d/(d-1)}$. 

Now, some of these $n$ edges intersecting $C$ can have both endpoints outside $3C$. Such edges are 
longer than the sidelength of $C$ and by Claim 4.7 their number is bounded by $c_d \phi$. 

Hence, the number of vertices of $\pi$ inside $3C$ is at least $n - c_d \phi \ge c_3 (a/r)^{d/(d-1)} - c_d \phi$. ■ 



Hölder's inequality states that $\sum_{i=1}^{n} |a_i b_i| \le \left(\sum_{i=1}^{n} |a_i|^p\right)^{1/p} \left(\sum_{i=1}^{n} |b_i|^q\right)^{1/q}$ if $1/p + 1/q = 1$. 

Remark 4.10 One can also prove Lemma 4.9 directly, by building a quadtree and arguing that 
for a low-density curve to be sufficiently long, many edges in it have to be (sufficiently) short, thus 
implying the same bound. However, the current proof is more intuitive and cleaner. 

Observation 4.11 The bound in Lemma 4.9 is tight. For any $m > 0$ and any $d > 0$, consider the 
integer grid in $\mathbb{R}^d$ with coordinates in the range $1, \ldots, m$, and compute a path that visits all these 
grid points using only the grid edges of unit length, which is clearly possible. 

Now, the resulting curve is $2^d$-low density and has length $a = m^d - 1$ and its diameter is 
$r = \sqrt{d}\, m$. Lemma 4.9 implies that it has $\Omega((a/r)^{d/(d-1)}) = \Omega(m^d)$ vertices. Since this grid has 
$m^d$ vertices, this is tight. 
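
For $d = 2$ such a path is the familiar row-by-row "snake" through the grid. The sketch below builds it and is meant only to make the construction concrete; the function name `snake_path` is ours, and the higher-dimensional case is not covered.

```python
def snake_path(m):
    """Boustrophedon path through the m x m integer grid with coordinates 1..m.

    Odd rows are traversed left to right, even rows right to left, so that
    consecutive vertices are always joined by a unit-length grid edge.
    """
    path = []
    for y in range(1, m + 1):
        xs = range(1, m + 1) if y % 2 == 1 else range(m, 0, -1)
        path.extend((x, y) for x in xs)
    return path
```

The path has $m^2$ vertices and $m^2 - 1$ unit edges, so its length is $a = m^2 - 1$, matching the count in Observation 4.11 for $d = 2$.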

4.2.2 Accounting for many Reachable Free Space Cells 

If many columns of the free space diagram of the two simplified low density curves contain a linear 
number of reachable cells, then the curve must be "long" in the vicinity of the edges corresponding 
to those columns, since the simplification ensures a minimal edge length. A similar argument holds 
for the rows. Therefore, using Lemma 4.9, we can charge the additional reachable cells to vertices 
of the original curves. This yields the following result. 

Lemma 4.12 For any two low density curves $\pi$ and $\sigma$ in $\mathbb{R}^d$, and $0 < \varepsilon < 1$, we have that 
$N(\varepsilon, \pi, \sigma) = O\!\left(n^{2(d-1)/d}/\varepsilon^2\right)$. 

Proof: Let $\delta > 0$ be an arbitrary radius, and let $\pi' = \mathrm{simpl}(\pi, \mu)$ and $\sigma' = \mathrm{simpl}(\sigma, \mu)$ be their 
simplifications, where $\mu = \varepsilon\delta$. Then, we need to prove that $N_{\le\delta}(\pi', \sigma') = O\!\left(n^{2(d-1)/d}/\varepsilon^2\right)$. 

To this end, it suffices to bound the number of vertex-edge pairs $(p, u)$, where $p$ is a vertex of 
$\pi'$, $u$ is an edge of $\sigma'$, and the distance between $p$ and $u$ is at most $\delta$ (naturally, we need to apply 
the same argument to pairs with vertices in $\sigma'$ and edges in $\pi'$). The total number of such pairs 
bounds the total complexity of $\mathcal{R}_{\le\delta} = \mathcal{R}_{\le\delta}(\pi', \sigma')$. 

Set $M = O(n^{1-2/d}/\varepsilon^2)$, and associate every vertex-edge pair $(p, u)$ that appears in the free 
space diagram $\mathcal{R}_{\le\delta}$ with the vertex $p$. 

Consider the grid $G$ of side length $\delta$. For a grid cell $R$, consider the 
vertex of $\pi'$ in $R$ that is associated with the largest number of such 
vertex-edge pairs, and say it is being associated with $d_R$ such vertex-edge 
pairs, and let $v_R$ denote this "popular" vertex of $\pi'$. The total 
number of vertex-edge pairs associated with vertices of $\pi'$ inside $R$ is 
bounded by $U_R = |\pi' \cap R|\, d_R$, where $|\pi' \cap R|$ denotes the number of 
vertices of $\pi'$ that lie inside $R$. 

If $d_R \le M$ then $U_R \le |\pi' \cap R|\, M$, and we charge $M$ units to each 
vertex of $\pi$ inside $R$. 

If $d_R > M$ then the length of $\sigma'$ inside $C/3$ is at least $d_R \mu$, where $C$ is a cube centered at $R$ with 
side length $O(\delta)$. Indeed, all the charges $d_R$ arise from different segments of $\sigma'$ that are in distance 
at most $\delta$ from $v_R$, and each such segment has length at least $\mu$. 

By Lemma 4.9, we have that $\sigma$ must have at least $\Omega((d_R \mu/\delta)^{d/(d-1)}) = \Omega((d_R \varepsilon)^{d/(d-1)})$ vertices 
inside $C$. There is some constant $c$ such that 

$$d_R^2 \le \frac{c}{\varepsilon^2}\, |\sigma \cap C|^{2(d-1)/d} = \frac{c}{\varepsilon^2}\, |\sigma \cap C|^{1-2/d}\, |\sigma \cap C| \le \frac{c}{\varepsilon^2}\, n^{1-2/d}\, |\sigma \cap C| \le M\, |\sigma \cap C|,$$

by picking $M$ to be sufficiently large. In particular, if $|\pi' \cap R| \le d_R$, then $U_R = |\pi' \cap R|\, d_R \le d_R^2 \le 
M |\sigma \cap C|$. Hence, we charge $M$ units to each vertex of $\sigma$ inside the cube $C$. 

Otherwise, $|\pi' \cap R| > d_R > M$. But then, the length of $\pi'$ inside $C$ is at least $|\pi' \cap R|\, \mu$, and by 
Lemma 4.9 we have that $\pi$ must have at least $\Omega((|\pi' \cap R|\, \varepsilon)^{d/(d-1)})$ vertices inside $C$. Arguing as 
above, this implies that $|\pi' \cap R|^2 \le M |\pi \cap C|$. As such, we have that $U_R = |\pi' \cap R|\, d_R \le |\pi' \cap R|^2 \le 
M |\pi \cap C|$. Again, we charge $M$ units to each vertex of $\pi$ inside the cube $C$. 

Since $C$ intersects a constant number of cells of the grid, no vertex would get charged more 
than a constant number of times by the above scheme. Thus, every vertex, of either curve, gets 
charged $O(M)$ units overall, and the total number of vertex-edge pairs present in $\mathcal{R}_{\le\delta}$ is $O(nM)$, 
as claimed. ■ 

Observation 4.13 One can extend the construction of Observation 4.11 to show that Lemma 4.12 
is close to being tight. Indeed, consider the grid curve of Observation 4.11 in $d - 1$ dimensions, for 
an integer $m$. We now lift it to $d$ dimensions by considering the $[1, m]^d$ cube and placing two copies 
of the above curve on two opposite faces of the cube, denoted by $f$ and $f'$. Let $\pi_1$ and $\pi_2$ denote 
these two copies. 

Next, delete the even edges from $\pi_1$ and the odd edges from $\pi_2$. Connect every vertex $v_1$ of $\pi_1$ to 
its corresponding (copied) vertex $v_2$ in $\pi_2$ by a path made out of the $m - 1$ unit edges along the grid 
line connecting the two vertices. This results in a curve $\pi$ that is similar to the curve constructed in 
Observation 4.11 but has the advantage that when simplified for the distance $\mu = m$ it results in a 
curve with $m^{d-1}$ segments of length $\ge m$ that connect points that lie on $f$ and on $f'$, respectively. 

Let $\sigma$ be a copy of $\pi$. For a fixed $\varepsilon > 0$, we can add a single segment to it such that the Fréchet 
distance between the resulting curves is exactly $\delta = m/\varepsilon$. Now, these two curves have $n = 2m^d + 2$ 
vertices overall, and furthermore, when we simplify them for the distance $\mu = \varepsilon\delta = m$, we end up 
with two curves such that every long edge of $\pi'$ is going to be in distance $\le \delta = m/\varepsilon$ from a constant 
fraction of the edges of $\sigma'$ (this would be all the edges if $1/\varepsilon \ge \sqrt{d}$). Therefore the complexity of 
the reachable free space is $\Omega(n_V^2) = \Omega\!\left((m^{d-1})^2\right) = \Omega\!\left(n^{2(d-1)/d}\right)$, where $n_V$ denotes the number 
of vertices of $\pi'$. The upper bound of Lemma 4.12 is (only) larger by a factor of $O(1/\varepsilon^2)$. 

By plugging the above into Theorem 3.16 we get the following result. 

Theorem 4.14 Given two low-density curves $\pi$ and $\sigma$ with a total of $n$ vertices in $\mathbb{R}^d$, and a 
parameter $\varepsilon > 0$, there exists an algorithm which $(1 + \varepsilon)$-approximates the Fréchet distance between 
$\pi$ and $\sigma$ in $O\!\left(\frac{n^{2(d-1)/d}}{\varepsilon^2} + n^{2(d-1)/d} \log n\right)$ time. 

4.3 Relative Free Space Complexity of $\kappa$-Bounded Curves 

We revisit the definitions of Alt et al. [AKW04] of $\kappa$-bounded and $\kappa$-straight curves. Note that 
these definitions describe an extremely restricted class of curves while c-packed curves form a fairly 
general and natural class of curves. However, it is not true that any $\kappa$-bounded curve is $O(\kappa)$-packed. 

Figure 4: Koch's snowflake is an example of a $\kappa$-bounded curve that has infinite length but a finite 
diameter. 

We therefore give a separate proof to bound the relative free space complexity of $\kappa$-bounded curves 
in order to improve upon the result in [AKW04]. 

Definition 4.15 Let $\kappa \ge 1$ be a given parameter. A curve $\pi$ is $\kappa$-straight if 
for any two points $p$ and $q$ on the curve, it holds that $\|\pi[p, q]\| \le \kappa \|p - q\|$. 

A curve $\pi$ is $\kappa$-bounded if for all $p, q \in \pi$ it holds that the curve $\pi[p, q]$ 
is contained inside $B(p, r) \cup B(q, r)$, where $r = (\kappa/2) \|p - q\|$, see the figure on 
the right. 

Lemma 4.16 A $\kappa$-straight curve is $2\kappa$-packed. 

Proof: Let $\pi$ be a $\kappa$-straight curve in $\mathbb{R}^d$, and consider any ball $B(p, r)$ that intersects it. Let $q$ 
and $s$ be the first and last points, respectively, along $\pi$ that are in $B(p, r)$. Clearly, $\|q - s\| \le 2r$, 
and by the $\kappa$-straightness $\|\pi \cap B(p, r)\| \le \|\pi[q, s]\| \le \kappa \|q - s\| \le 2\kappa r$. ■ 

Remark 4.17 It is easy to verify that a $\kappa$-straight curve is also $\kappa$-bounded. However, $\kappa$-bounded curves, counterintuitively, can have infinite length even when contained inside a finite domain. An example of this is Koch's snowflake, which is a fractal curve depicted in Figure 4.

To see, intuitively, why Koch's snowflake is $\kappa$-bounded, let $\pi_i$ be the $i$th polygonal curve generated by this process. There is a natural mapping between any point of $\pi_i$ and $\pi_{i+1}$, for all $i$. In particular, consider two points $p$ and $q$ on the final curve $\pi^*$, and consider the two sequences of points $p_i, q_i \in \pi_i$, where $p_{i+1} \in \pi_{i+1}$ (resp. $q_{i+1} \in \pi_{i+1}$) is the natural image of $p_i$ (resp. $q_i$), $\lim_{i\to\infty} p_i = p$, and $\lim_{i\to\infty} q_i = q$.

Now, assume that $r = \|p - q\|$. Observe that, for all $i$, the polygonal curve $\pi_i$ is made out of segments that are all of the same length. In particular, consider the first index $k$ such that the segment length of $\pi_k$ is at most $r/20$. It is easy to argue that $\|p_k - p\| \le r/5$ and $\|q_k - q\| \le r/5$. In fact, one can argue that no point of $\pi_k$ moves a distance larger than $r/5$ to its final location on $\pi^*$.

Now, a tedious argument shows that there are $O(1)$ segments of $\pi_k$ separating $p_k$ from $q_k$. Therefore this portion of the curve $\pi_k$ is covered by a disk of radius $O(r)$, and the corresponding portion of the final curve between $p$ and $q$ is also covered by a disk of radius $O(r)$. This implies that Koch's snowflake is $\kappa$-bounded.

A formal proof of this fact is considerably more tedious and is omitted. 
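The two facts used in the remark (all segments of $\pi_i$ have the same length, and the length grows without bound while the curve stays in a bounded region) can be checked with a short script. This is an illustrative sketch only: it builds one side of the snowflake (the Koch curve over the unit segment) using complex numbers as points, and the names `koch_step` and `curve_length` are hypothetical.

```python
import cmath
import math

def koch_step(points):
    """Replace every segment by the four segments of the Koch construction."""
    rot = cmath.exp(1j * math.pi / 3)  # rotation by 60 degrees
    out = [points[0]]
    for a, b in zip(points, points[1:]):
        d = (b - a) / 3  # one third of the segment
        out.extend([a + d, a + d + d * rot, a + 2 * d, b])
    return out

def curve_length(points):
    """Total length of the polygonal curve through the given points."""
    return sum(abs(b - a) for a, b in zip(points, points[1:]))

curve = [0j, 1 + 0j]
for i in range(1, 7):
    curve = koch_step(curve)
    # Distance of the farthest vertex from the midpoint of the base segment.
    radius = max(abs(z - 0.5) for z in curve)
    print(i, len(curve) - 1, round(curve_length(curve), 4), round(radius, 4))
```

Each step multiplies the number of segments by 4 and the length by 4/3, while every vertex stays within distance 1/2 of the midpoint of the base segment.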

Lemma 4.18 Let $\pi$ be a $\kappa$-bounded polygonal curve in $\mathbb{R}^d$, and let $\mu < \delta$ be parameters. Let $\pi' = \mathrm{simpl}(\pi, \mu)$. Then the number of segments of $\pi'$ intersecting $B(s,\delta)$ is bounded by $O\big(\kappa^d (1 + \delta/\mu)^d\big)$, for any $s \in \mathbb{R}^d$.

Proof: For $\pi' = u_1 u_2 \ldots u_k$, let $Y_O = \{u_1, u_3, \ldots\}$ and $Y_E = \{u_2, u_4, \ldots\}$ be the sets of odd and even segments of $\pi'$, respectively.

Let $X_O \subseteq Y_O$ be the set of odd segments of $\pi'$ intersecting $B(s,\delta)$. For all $i$, pick an arbitrary point $p_i$ on the $i$th segment of $X_O$ that lies inside $B(s,\delta)$. Next, pick an original point $q_i$ of $\pi$ in distance at most $\mu$ from $p_i$, for $i = 1, \ldots, M = |X_O|$. Observe that for all $i$ we have $\|s - q_i\| \le \delta + \mu$. Furthermore, between any two distinct points $p_i$ and $p_j$ on the simplified curve $\pi'$ there must lie an even segment of $Y_E$ along the curve, and the length of this segment is at least $\mu$ (because the simplification algorithm generates segments of length at least $\mu$). Also, the endpoints of this even segment lie on the original curve $\pi$.

We claim that no two points of $Q = \{q_1, \ldots, q_M\}$ can be too close to each other; that is, there are no two points $q', q'' \in Q$ such that $r = \|q' - q''\| < \mu/(4\kappa)$. Indeed, assume for the sake of contradiction that there are two such points. Then, by the above, the portion of $\pi$ connecting them contains two points $t', t''$ that are at least $\mu$ apart. Observe that $\pi[t', t''] \subseteq X = B(q', (\kappa/2)r) \cup B(q'', (\kappa/2)r)$. However, the maximum distance between two points that are included inside $X$ is bounded by its diameter. We have that

$$\mu \le \|t' - t''\| \le \mathrm{diam}(X) = 2(\kappa/2)r + \|q' - q''\| < \frac{\mu}{4} + \frac{\mu}{4\kappa} \le \frac{\mu}{2},$$

since $\kappa \ge 1$. A contradiction.

However, all the points of $Q$ lie inside a ball of radius $\delta + \mu$ centered at $s$. Now, placing a ball of radius $\mu' = \mu/(8\kappa)$ around each point of $Q$ results in a set of interior disjoint balls. This implies, by a standard packing argument, that the number of points of $Q$ is bounded by

$$\frac{\mathrm{vol}(B(s, \delta + 2\mu))}{\mathrm{vol}(B(\cdot, \mu'))} = O\!\left(\frac{(\delta + \mu)^d}{(\mu/\kappa)^d}\right) = O\big((1 + \delta/\mu)^d \kappa^d\big).$$

This bounds the number of odd segments of $\pi'$ intersecting the ball $B(s,\delta)$, and a similar argument holds for the even segments intersecting this ball. ■
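The quantity bounded by the lemma can be measured directly on an example. The sketch below rests on an assumption about $\mathrm{simpl}$ that is not restated in this section: it takes the simplification to be the greedy procedure that keeps the first vertex, then repeatedly keeps the next vertex at distance at least $\mu$ from the last kept one, and finally keeps the last vertex (so only the final edge may be shorter than $\mu$). The function names are hypothetical.

```python
import math

def simpl(curve, mu):
    """Greedy simplification (assumed form): keep a vertex once it is at
    distance >= mu from the last kept vertex; always keep both endpoints.
    Assumes the curve has at least two vertices."""
    kept = [curve[0]]
    for v in curve[1:-1]:
        if math.dist(kept[-1], v) >= mu:
            kept.append(v)
    kept.append(curve[-1])
    return kept

def point_segment_distance(s, a, b):
    """Euclidean distance from point s to the segment ab."""
    d = [y - x for x, y in zip(a, b)]
    len2 = sum(x * x for x in d)
    if len2 == 0.0:
        return math.dist(s, a)
    # Parameter of the orthogonal projection of s, clamped to the segment.
    t = sum((c - x) * e for c, x, e in zip(s, a, d)) / len2
    t = min(1.0, max(0.0, t))
    return math.dist(s, [x + t * e for x, e in zip(a, d)])

def count_segments_meeting_ball(curve, s, delta):
    """Number of edges of the curve that intersect the ball B(s, delta)."""
    return sum(
        1
        for a, b in zip(curve, curve[1:])
        if point_segment_distance(s, a, b) <= delta
    )
```

Calling `count_segments_meeting_ball(simpl(curve, mu), s, delta)` for various centers `s` gives the count that the lemma bounds by $O(\kappa^d (1 + \delta/\mu)^d)$.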

Lemma 4.19 For any two $\kappa$-bounded polygonal curves $\pi$ and $\sigma$ in $\mathbb{R}^d$, and $0 < \varepsilon < 1$, we have $N(\varepsilon, \pi, \sigma) = O((\kappa/\varepsilon)^d n)$.

Proof: Let $\delta > 0$ be an arbitrary radius, and set $\mu = \varepsilon\delta$. Let $\pi' = \mathrm{simpl}(\pi, \mu)$ and $\sigma' = \mathrm{simpl}(\sigma, \mu)$. We need to show that the complexity of the reachable free space $\mathcal{R}_{\le\delta}(\pi', \sigma')$ is $O(\kappa^d (1 + \delta/\mu)^d n) = O((\kappa/\varepsilon)^d n)$.

The boundary of a reachable cell in the free space diagram has a non-empty intersection with $\mathcal{D}_{\le\delta}(\pi', \sigma')$. Otherwise its interior could not be reached by a monotone path from $(0,0)$. Therefore, using an argument similar to the proof of Lemma 4.4, Lemma 4.18 implies the desired bound. ■
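The last equality in the bound of the proof uses only $\mu = \varepsilon\delta$ and $\varepsilon < 1$; spelled out, with the dimension $d$ treated as a constant:

```latex
\[
  1 + \frac{\delta}{\mu} \;=\; 1 + \frac{1}{\varepsilon} \;\le\; \frac{2}{\varepsilon}
  \qquad\Longrightarrow\qquad
  \kappa^d \Bigl(1 + \frac{\delta}{\mu}\Bigr)^{d} n
  \;\le\; 2^d \Bigl(\frac{\kappa}{\varepsilon}\Bigr)^{d} n
  \;=\; O\!\Bigl(\bigl(\kappa/\varepsilon\bigr)^{d} n\Bigr).
\]
```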

By plugging the above into Theorem 3.16 we get the following result.

Theorem 4.20 Given two $\kappa$-bounded polygonal curves $\pi$ and $\sigma$ with a total of $n$ vertices in $\mathbb{R}^d$, and a parameter $1 > \varepsilon > 0$, there exists an algorithm which $(1+\varepsilon)$-approximates the Fréchet distance between $\pi$ and $\sigma$ in $O\big((\kappa/\varepsilon)^d n + \kappa^d n \log n\big)$ time.

5 Extension to Closed c-packed Curves 

The Fréchet distance for closed curves is defined as in Section 2 with the altered condition that the reparameterizations $f$ and $g$ are orientation-preserving homeomorphisms on the one-dimensional sphere. Computing the Fréchet distance for closed curves is more difficult, as the constraint that the endpoints of the curves have to be matched to each other is dropped in this case, and therefore the set of reparameterizations one has to consider is larger.

Observation 5.1 The decision problem for closed curves can be reduced to the previously considered case of open curves. Let $\pi$ and $\sigma$ be two closed $c$-packed curves and let $\delta$ be a parameter. Pick a vertex $p$ of the curve $\pi$, and assume that we know a point $q$ on $\sigma$ that is being matched to $p$ by a pair of reparameterizations of $\pi$ and $\sigma$ of width at most $\delta$. Clearly, if we break $\pi$ open at $p$, and $\sigma$ at $q$, we retrieve two open curves $\pi$ and $\sigma$, and we can use the previous method to decide if $d_{\mathcal{F}}(\pi, \sigma) \le \delta$. Hence we only need to generate a suitable set of candidates for $q$ to determine if the Fréchet distance between $\pi$ and $\sigma$ is at most $\delta$ within a certain approximation error.
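Breaking a closed polygonal curve open at a given point is a purely combinatorial step. The following minimal sketch assumes a closed curve is stored as a list of vertex tuples with the last vertex implicitly joined to the first, and that the break point lies on edge `i` at parameter `t` in $[0,1]$; the function name `break_open` and this representation are illustrative choices, not the paper's.

```python
def break_open(closed, i, t=0.0):
    """Return the open polygonal curve obtained by cutting a closed polygon
    at the point q = (1 - t) * closed[i] + t * closed[i + 1].

    The result starts and ends at q and traverses the polygon once.
    """
    n = len(closed)
    a, b = closed[i], closed[(i + 1) % n]
    q = tuple((1 - t) * x + t * y for x, y in zip(a, b))
    # Vertices following q in cyclic order: b, ..., a.
    rest = [closed[(i + 1 + k) % n] for k in range(n)]
    curve = [q] + rest + [q]
    # Drop a repeated vertex when q coincides with a or b.
    return [v for k, v in enumerate(curve) if k == 0 or v != curve[k - 1]]
```

With `t = 0` this cuts at the vertex `closed[i]`, which corresponds to breaking $\pi$ open at the vertex $p$; a general `t` covers a candidate point $q$ in the interior of an edge of $\sigma$.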

Lemma 5.2 Let $\pi$ be a closed $c$-packed polygonal curve in $\mathbb{R}^d$, and let $\mu < \delta$ be parameters. Let $\pi' = \mathrm{simpl}(\pi, \mu)$. Then the number of edges of $\pi'$ intersecting $B(p,\delta)$ is bounded by $O(c\delta/\mu)$, for any $p \in \mathbb{R}^d$.

Proof: Consider the ball $B = B(p, r)$ of radius $r = \mu + \delta$. Any edge $u$ of $\pi'$ that intersects $B(p,\delta)$ has to contribute at least $\mu$ to the length of the intersection with $B$, as the simplification guarantees that every edge of $\pi'$ is of length at least $\mu$. Since $\pi'$ is $6c$-packed, by Lemma 4.3, we have that $\|B \cap \pi'\| \le 6cr$, and the number of intersections of $\pi'$ with $B(p,\delta)$ is $N \le \|B \cap \pi'\|/\mu \le 6cr/\mu = 6c(\mu + \delta)/\mu = O(c + c\delta/\mu)$, which implies the claim since $\mu < \delta$. ■

Lemma 5.3 Given two closed $c$-packed polygonal curves $\pi$ and $\sigma$ with a total number of $n$ vertices and parameters $\delta$ and $1 > \varepsilon > 0$. Let $\pi' = \mathrm{simpl}(\pi, \mu)$ and $\sigma' = \mathrm{simpl}(\sigma, \mu)$ denote the curves simplified with $\mu < \varepsilon\delta$ and let $p$ be a vertex of $\pi'$. We can compute a set of points $\mathcal{K} \subseteq \sigma'$ of size $O(c/\varepsilon)$, in $O(n + c/\varepsilon)$ time, such that if $d_{\mathcal{F}}(\pi', \sigma') \le \delta$ then there exists a pair of reparameterizations of width at most $(1+\varepsilon)\delta$ that matches $p$ to an element of $\mathcal{K}$.

Proof: We walk along the curve $\sigma'$ starting from an arbitrary point. If the starting point is within distance $\delta$ from $p$, then we add it to the candidate set $\mathcal{K}$. As we follow along the curve we create a candidate if we

(a) (re-)enter the ball $B(p,\delta)$, or

(b) have traveled a distance $\varepsilon\delta$ along $\sigma'$ since the last creation of a candidate, unless we have exited the ball $B(p,\delta)$ in the meantime.

Clearly this takes $O(n + |\mathcal{K}|)$ time.

The number of events of type (a) is bounded (up to a factor of 2) by the number of intersections of $\sigma'$ with the sphere $S(p,\delta)$, and by Lemma 5.2 this number is bounded by $O(c\delta/\mu) = O(c/\varepsilon)$. By Lemma 4.3 the simplified curve $\sigma'$ is $6c$-packed and therefore the length of its intersection with $B(p,\delta)$ is at most $6c\delta$. This implies that we can have at most $O(6c\delta/\mu) = O(6c/\varepsilon)$ candidates that were created at events of type (b).

Consider reparameterizations of $\pi'$ and $\sigma'$ of width at most $\delta$. Next, consider a point $q \in \sigma'$ that is matched to $p \in \pi'$ by these reparameterizations. Observe that $q \in B(p,\delta)$ and there exists, by construction, a point $q' \in \mathcal{K}$ such that $\|q - q'\| \le \varepsilon\delta$. Let $p'$ be a point on $\pi'$ that is matched to $q'$ by the given reparameterizations.

We match the curve segment $\hat\sigma$ between $q$ and $q'$ to $p$ and the curve segment $\hat\pi$ between $p$ and $p'$ to $q'$, see the figure to the right. Clearly this preserves the monotonicity of the matching. By the triangle inequality, any point on $\hat\sigma$ has distance at most $(1+\varepsilon)\delta$ to $p$. Similarly, for any point on $\hat\pi$ there is a point on $\hat\sigma$ that is in distance $\delta$, therefore $q'$ is in distance $(1+\varepsilon)\delta$ of $\hat\pi$.

We conclude that the Fréchet distance between $\pi'$ and $\sigma'$ is at most $(1+\varepsilon)\delta$ when restricted to reparameterizations matching $p$ to $q'$. ■
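The walk that generates the candidate set in the proof above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the closed curve is a list of vertex tuples (last vertex joined to the first), the walk starts at vertex 0, `eps > 0`, and the helper names `ball_interval` and `candidates` are hypothetical. Because a ball is convex, each edge meets $B(p,\delta)$ in at most one sub-segment, which is found by solving a quadratic.

```python
import math

def ball_interval(a, b, p, delta):
    """Parameter interval [t0, t1] of the segment ab inside B(p, delta),
    or None if the segment misses the ball."""
    d = [y - x for x, y in zip(a, b)]
    f = [x - c for x, c in zip(a, p)]
    qa = sum(x * x for x in d)
    qb = 2 * sum(x * y for x, y in zip(d, f))
    qc = sum(x * x for x in f) - delta * delta
    if qa == 0:  # degenerate edge: a single point
        return (0.0, 1.0) if qc <= 0 else None
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        return None
    root = math.sqrt(disc)
    t0 = max(0.0, (-qb - root) / (2 * qa))
    t1 = min(1.0, (-qb + root) / (2 * qa))
    return (t0, t1) if t0 <= t1 else None

def candidates(sigma, p, delta, eps):
    """Candidate points on the closed curve sigma, created (a) whenever the
    walk enters B(p, delta) and (b) every eps * delta of arc length
    travelled inside the ball since the last candidate."""
    step = eps * delta
    out = []
    inside = False  # did the previous edge end inside the ball?
    since = 0.0     # arc length travelled inside since the last candidate
    n = len(sigma)
    for i in range(n):
        a, b = sigma[i], sigma[(i + 1) % n]
        interval = ball_interval(a, b, p, delta)
        if interval is None:
            inside = False
            continue
        t0, t1 = interval
        length = math.dist(a, b)

        def point(t):
            return tuple(x + t * (y - x) for x, y in zip(a, b))

        if not inside:  # event (a), or the starting point lies in the ball
            out.append(point(t0))
            since = 0.0
        pos, end = t0 * length, t1 * length
        while pos + (step - since) <= end:  # events (b)
            pos += step - since
            since = 0.0
            out.append(point(pos / length))
        since += end - pos
        inside = (t1 == 1.0)
    return out
```

Each returned point lies in $B(p,\delta)$, and any point of the curve inside the ball is within arc length $\varepsilon\delta$ of some candidate, which is the property the proof relies on.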
One can adapt Lemma 3.5 to the closed curves case, by considering the $O(cn/\varepsilon)$ open curves that result from breaking $\sigma'$ at any point of $\mathcal{K}$. The details of the adaptation are straightforward, and we only state the result.

Lemma 5.4 Given two closed polygonal $c$-packed curves $\pi$ and $\sigma$ with a total of $n$ vertices in $\mathbb{R}^d$, and parameters $\delta$ and $1 > \varepsilon > 0$. Then, there exists an algorithm which, in $O((c/\varepsilon)^2 n)$ time, correctly outputs one of the following:

(A) If $d_{\mathcal{F}}(\pi,\sigma) \le \delta$ then the algorithm outputs "$d_{\mathcal{F}}(\pi,\sigma) \le (1+\varepsilon)\delta$".

(B) If $d_{\mathcal{F}}(\pi,\sigma) > (1+\varepsilon)\delta$ then the algorithm outputs "$d_{\mathcal{F}}(\pi,\sigma) > \delta$".

(C) If $d_{\mathcal{F}}(\pi,\sigma) \in [\delta, (1+\varepsilon)\delta]$ then the algorithm outputs either of the above outcomes.

Plugging Lemma 5.4 into the algorithm of Theorem 3.16 we get the following result.

Theorem 5.5 Given two closed polygonal $c$-packed curves $\pi$ and $\sigma$ with a total of $n$ vertices in $\mathbb{R}^d$, and a parameter $1 > \varepsilon > 0$, one can $(1+\varepsilon)$-approximate the Fréchet distance between $\pi$ and $\sigma$ in $O(c^2 n(\varepsilon^{-2} + \log n))$ time.



6 Conclusions 

We presented a new approximation algorithm for the Fréchet distance for polygonal curves in any fixed dimension. The new algorithm is surprisingly simple and should be practical. Furthermore, it works for any kind of polygonal curve. Since the algorithm simplifies the curves to the "right" resolution during the execution, we expect the algorithm to be fast in practice. The algorithm's analysis relies on the concept of the relative free space complexity of curves, which tries to capture the complexity of the free space diagram when simplification is being used.

Next, we introduced the c-packed family of curves. While not all curves are c-packed, it seems 
that most real life curves are c-packed. The family of c-packed curves is closed under simplification, 
and the property of a curve being c-packed is independent of the ambient dimension of the space 
containing the curve. We expect this concept to be used to analyze other algorithms in the future. 

In particular, the relative free space complexity of $c$-packed curves is linear. We gave bounds for the relative free space complexity for several other types of curves, from low density curves to $\kappa$-bounded curves. Finally, we also showed that the algorithm can be modified to handle closed curves efficiently.



Lower bound. Our solution to the decision problem "beats" the lower bound of $\Omega(n \log n)$ [BBK+07] by a factor of $\log n$ (see Lemma 3.5). Since our decision procedure is approximate, this is not too surprising. However, it is enlightening to consider where the lower bound proof breaks in our setting. Indeed, Buchin et al. [BBK+07] generate two curves such that the Fréchet distance might be realized by one vertex of one curve matching the whole other curve. On the other hand, in our case, the input model coupled with simplification guarantees that the number of segments matching a single vertex is only a constant.

A Fatness implies c-packedness

We show that the boundary of an $(\alpha, \beta)$-covered shape is $c$-packed even if the shape does not have a finite descriptive complexity. A somewhat similar result (which however is too weak to prove this result) is the packing lemma of de Berg [dB08] that shows that the boundary of the union of $\gamma$-fat shapes has low density. This implies that a connected component of this boundary has low density.

As mentioned in the introduction, since Koch's snowflake is $\gamma$-fat, if the finiteness requirement is removed, it follows that the boundaries of $\gamma$-fat shapes with unbounded descriptive complexity are not $c$-packed, for any finite $c$.

Definition A.1 A bounded simply connected region $P$ in the plane is $(\alpha, \beta)$-covered if for each point $p \in \partial P$, there exists a triangle $T_p$, called a good triangle of $p$, such that: (i) $p$ is a vertex of $T_p$, (ii) $T_p \subseteq P$, (iii) all the angles of $T_p$ are at least $\alpha$, and (iv) the length of all the edges of $T_p$ is at least $\beta\,\mathrm{diam}(P)$.






Note that our definition is different from the standard definition of $(\alpha, \beta)$-covered shapes, since we do not require that the region $P$ has a finite descriptive complexity.



Lemma A.2 Let $S$ be a set of segments contained inside a disk with radius $r$, such that for any point $p$ lying on a segment of $S$, there is an infinite cone $V$ of angle at least $\alpha < \pi$ with an apex at $p$, such that the intersection of the interior of $V$ with $S$ is empty. Then, $\|S\| \le 10\pi r/(\alpha \sin(\alpha/4))$.

Proof: Let $\mathcal{F}$ be a family of $\lceil 2\pi/(\alpha/2) \rceil$ cones, centered at the origin, such that they cover all directions, and each cone has angle $\alpha/2$. Clearly, for any point $p$ lying on a segment of $S$, there must be a cone $V \in \mathcal{F}$, such that the interior of $p + V$ does not intersect $S$. We will say that $p$ is exposed by $V$.

So, fix such a cone $V \in \mathcal{F}$ and consider the direction $v$ that splits the angle of $V$ into two. Rotate the plane such that $v$ is the direction of the negative $y$-axis, and observe that any point of $S$ that is exposed by (the rotated) $V$ lies on the lower envelope of the segments of $S$. Furthermore, the segment $u \in S$ that contains this point must have an angle in the range $(-\pi/2 + \alpha/4, \pi/2 - \alpha/4)$ with the positive direction of the $x$-axis (we assume $u$ is oriented from left to right).

Now, since the projection of $S$ on the $x$-axis has length at most $2r$, it follows that the total length of the segments exposed by $V$ is at most $2r/\sin(\alpha/4)$.

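Hence, the total length of segments of $S$ is bounded by

$$|\mathcal{F}| \left(\frac{2r}{\sin(\alpha/4)}\right) \le \left(\frac{5\pi}{\alpha}\right)\left(\frac{2r}{\sin(\alpha/4)}\right) = \frac{10\pi r}{\alpha \sin(\alpha/4)}.$$ ■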

Lemma A.3 If $P$ is an $(\alpha, \beta)$-covered polygon in the plane then it is $c$-packed, for

$$c = O\!\left(\frac{1}{\alpha\beta \sin(\alpha/4)\tan(\alpha)}\right).$$



Proof: Let $S = \partial P$, and consider any disk $D$ of radius $r$ in the plane. Observe that the height of a good triangle is at least $\rho = (s/2)\tan(\alpha)$, for $s = \beta\,\mathrm{diam}(P)$, and this also bounds the distance of any vertex of a good triangle to its facing edge. If $r \le \rho/2$, then any good triangle for a point of $S$ behaves like a cone as far as $S \cap D$ is concerned, and Lemma A.2 implies that $\|S \cap D\| \le 10\pi r/(\alpha \sin(\alpha/4))$ as desired.

If $r \le \mathrm{diam}(P)$, then cover $D$ by $m = (2\sqrt{2}\,r/\rho + 1)^2$ disks of radius $\rho/2$. Clearly, for each such disk, the total length of segments of $S$ inside it, by Lemma A.2, is at most $5\pi\rho/(\alpha \sin(\alpha/4))$. Therefore the total length of $S$ inside $D$ is

$$\frac{5\pi\rho}{\alpha\sin(\alpha/4)} \left(\frac{2\sqrt{2}\,r}{\rho} + 1\right)^2 \le \frac{160\pi}{\alpha\sin(\alpha/4)} \cdot \frac{r}{\rho} \cdot r \le \frac{320\pi}{\alpha\beta\sin(\alpha/4)\tan(\alpha)}\, r.$$

Observe that the total length of $\partial P$ is bounded by the above bound, by taking $D$ to be a disk of radius $r = \mathrm{diam}(P)$ centered at some point of $P$. Therefore the claim trivially holds in the case $r > \mathrm{diam}(P)$. ■
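The covering step of the proof can be made concrete. The sketch below is one way to realize it, under the assumption (not spelled out in the text) that the cover comes from a square grid: a grid cell of side $\rho/\sqrt{2}$ has half-diagonal $\rho/2$, so a disk of radius $\rho/2$ at the cell center covers the cell, and at most $2\sqrt{2}\,r/\rho + 1$ cells per axis are needed to cover the bounding square of $D$. The disk $D$ is taken to be centered at the origin, and `cover_centers` is a hypothetical name.

```python
import math

def cover_centers(r, rho):
    """Centers of disks of radius rho / 2 that together cover the disk of
    radius r centered at the origin (via a grid over its bounding square)."""
    side = rho / math.sqrt(2)      # grid cell inscribed in a disk of radius rho/2
    k = math.ceil(2 * r / side)    # cells per axis, at most 2*sqrt(2)*r/rho + 1
    return [
        (-r + (i + 0.5) * side, -r + (j + 0.5) * side)
        for i in range(k)
        for j in range(k)
    ]
```

The number of centers returned is `k * k`, which is at most the value $m = (2\sqrt{2}\,r/\rho + 1)^2$ used in the proof.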
B A proof of Lemma 4.8



Lemma B.1 Let $P$ be a set of $n$ points in $\mathbb{R}^d$, contained inside a hypercube $C$. Then one can cover $C$ by a set of cubes $\mathcal{X}$, such that the following properties hold.

(A) $\bigcup \mathcal{X} = C$.

(B) $|\mathcal{X}| \le 2^{d+1} d n$.

(C) Each $p \in C$ is covered by at most $2^d$ cubes.

(D) Each cube contains at most one point from $P$.

Proof: We can use the following algorithm to construct a $d$-dimensional reduced quadtree, of which the set of cubes corresponding to the leaf nodes satisfies the requirements for $\mathcal{X}$.

Take $C$ as the root node. Split the current node recursively into $2^d$ subcubes, until there is only one point left in the current node, while abiding by the following rule.

In each step, either (A) do a proper quadtree split if at least two of the immediately resulting subcubes contain a point of $P$, or (B) perform a reduced split otherwise, such that all points are contained in exactly one minimal subcube. A reduced split is formed by allowing the cubes to overlap, by shrinking the one of the $2^d$ subcubes containing the points, and enlarging all the others. Such a reduced split is depicted in the figure to the right. However, we can assure that only those subcubes will overlap that do not contain any point of $P$ and will therefore not be split further. Clearly each point in the covered area is covered by at most $2^d$ leaf nodes.

A split of type (A) separates the set of points into at least two non-empty subsets. A split of type (B) results in a point on the splitting plane. Both events can happen at most $dn$ times and produce each at most $2^d$ extra nodes. Therefore the size of $\mathcal{X}$ is bounded by $2^{d+1} d n$. ■

Proof of Lemma 4.8: This follows directly from Lemma B.1. Indeed, for every edge of $\pi$ add the corners of the axis parallel cube containing it to a set of points $P$. Next, consider the respective quadtree construction of Lemma B.1 for $P \subseteq C$. The cover uses at most $2^{d+1} d m$ boxes, where $m = |P| \le 2^d n$.

Consider a cube $C'$ in the resulting decomposition of $C$, and an edge $u$ of $\pi$ that intersects it. If the length of $u$ is shorter than the sidelength of $C'$, then one of the corners of the bounding cube of $u$ must be in $C'$, and $C'$ cannot be a leaf of the quadtree. This implies that $C'$ can be intersected only by edges that are at least as long as its sidelength.

By the low density property of $\pi$ and by Claim 4.7, $C'$ can intersect at most $c_d \phi$ edges of $\pi$, which implies the lemma. ■